Next: Output with Unicode strings <unistdio.h>
, Previous: Elementary Unicode string functions <unistr.h>
, Up: GNU libunistring [Contents][Index]
<uniconv.h>
This include file declares functions for converting between Unicode strings and char * strings in locale encoding or in other specified encodings.
The following function returns the locale encoding.
Function: const char * locale_charset ()
Determines the current locale’s character encoding, and canonicalizes it into one of the canonical names listed in localcharset.h. If the canonical name cannot be determined, the result is a non-canonical name.
The result must not be freed; it is statically allocated.
The result of this function can be used as an argument to the iconv_open function in GNU libc, in GNU libiconv, or in the gnulib provided wrapper around the native iconv_open function. It may not work as an argument to the native iconv_open function directly.
The handling of unconvertible characters during the conversions can be parametrized through the following enumeration type:
Type: enum iconv_ilseq_handler
This type specifies how unconvertible characters in the input are handled.
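Constant: enum iconv_ilseq_handler iconveh_error
This handler causes the function to return with errno set to EILSEQ.
Constant: enum iconv_ilseq_handler iconveh_question_mark
This handler produces one question mark ‘?’ per unconvertible character.
Constant: enum iconv_ilseq_handler iconveh_replacement_character
This handler produces one U+FFFD per unconvertible character if that fits in the target encoding, otherwise one question mark ‘?’ per unconvertible character.
Constant: enum iconv_ilseq_handler iconveh_escape_sequence
This handler produces an escape sequence \uxxxx or \Uxxxxxxxx for each unconvertible character.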
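The two escape sequence shapes can be illustrated without the library. The following is a hand-written sketch, not libunistring code: format_escape is a hypothetical helper, and both the choice of uppercase hexadecimal digits and the U+10000 threshold between the short and long form are assumptions of this sketch.

```c
#include <stdio.h>

/* Hypothetical helper, not part of libunistring: writes an escape
   sequence for the code point uc into out (at least 11 bytes) and
   returns its length.  Code points below U+10000 get the 4-digit form
   \uxxxx, all others the 8-digit form \Uxxxxxxxx.  The uppercase hex
   digits are an assumption of this sketch. */
int format_escape(unsigned long uc, char out[11])
{
    if (uc < 0x10000UL)
        return snprintf(out, 11, "\\u%04lX", uc);
    return snprintf(out, 11, "\\U%08lX", uc);
}
```

For example, U+20AC yields a 6-character sequence and U+1F600 a 10-character one.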
The following functions convert between strings in a specified encoding and Unicode strings.
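Function: uint8_t * u8_conv_from_encoding (const char *fromcode, enum iconv_ilseq_handler handler, const char *src, size_t srclen, size_t *offsets, uint8_t *resultbuf, size_t *lengthp)
Converts an entire string, possibly including NUL bytes, from one encoding to UTF-8 encoding.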
Converts a memory region given in encoding fromcode. fromcode is as for the iconv_open function.
The input is in the memory region between src (inclusive) and src + srclen (exclusive).
If offsets is not NULL, it should point to an array of srclen integers; this array is filled with offsets into the result, i.e. the character starting at src[i] corresponds to the character starting at result[offsets[i]], and other offsets are set to (size_t)(-1).
resultbuf and *lengthp should be a scratch buffer and its size, or resultbuf can be NULL.
May erase the contents of the memory at resultbuf.
If successful: The resulting Unicode string (non-NULL) is returned and its length stored in *lengthp. The resulting string is resultbuf if no dynamic memory allocation was necessary, or a freshly allocated memory block otherwise.
In case of error: NULL is returned and errno is set. Particular errno values: EINVAL, EILSEQ, ENOMEM.
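Function: char * u8_conv_to_encoding (const char *tocode, enum iconv_ilseq_handler handler, const uint8_t *src, size_t srclen, size_t *offsets, char *resultbuf, size_t *lengthp)
Converts an entire Unicode string, possibly including NUL units, from UTF-8 encoding to a given encoding.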
Converts a memory region to encoding tocode. tocode is as for the iconv_open function.
The input is in the memory region between src (inclusive) and src + srclen (exclusive).
If offsets is not NULL, it should point to an array of srclen integers; this array is filled with offsets into the result, i.e. the character starting at src[i] corresponds to the character starting at result[offsets[i]], and other offsets are set to (size_t)(-1).
resultbuf and *lengthp should be a scratch buffer and its size, or resultbuf can be NULL.
May erase the contents of the memory at resultbuf.
If successful: The resulting Unicode string (non-NULL) is returned and its length stored in *lengthp. The resulting string is resultbuf if no dynamic memory allocation was necessary, or a freshly allocated memory block otherwise.
In case of error: NULL is returned and errno is set. Particular errno values: EINVAL, EILSEQ, ENOMEM.
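The offsets convention is easiest to see when the source is UTF-8, because only the first unit of each character gets a real offset. The following hand-rolled sketch is not libunistring code and does not call the library; it converts a small subset of UTF-8 (U+0000..U+00FF) to ISO-8859-1 and fills an offsets array in the way the text above describes. The function name and its restrictions are assumptions made for illustration.

```c
#include <stddef.h>

/* Hand-rolled illustration, not libunistring code: converts UTF-8 text
   restricted to U+0000..U+00FF into ISO-8859-1.  out needs room for
   srclen bytes and offsets for srclen entries.  Returns the output
   length, or (size_t)(-1) if the input is malformed or contains a
   character above U+00FF. */
size_t utf8_to_latin1_with_offsets(const unsigned char *src, size_t srclen,
                                   unsigned char *out, size_t *offsets)
{
    size_t i, outlen = 0;

    /* Units that do not start a character keep (size_t)(-1). */
    for (i = 0; i < srclen; i++)
        offsets[i] = (size_t)(-1);

    i = 0;
    while (i < srclen) {
        unsigned char b = src[i];
        if (b < 0x80) {
            /* One-unit character. */
            offsets[i] = outlen;
            out[outlen++] = b;
            i += 1;
        } else if ((b == 0xC2 || b == 0xC3) && i + 1 < srclen
                   && (src[i + 1] & 0xC0) == 0x80) {
            /* Two-unit character in the range U+0080..U+00FF. */
            offsets[i] = outlen;
            out[outlen++] =
                (unsigned char)(((b & 0x03) << 6) | (src[i + 1] & 0x3F));
            i += 2;
        } else {
            return (size_t)(-1);
        }
    }
    return outlen;
}
```

For the three-unit input C3 A9 61 ("é" followed by "a"), the offsets array becomes 0, (size_t)(-1), 1: the continuation unit A9 does not start a character, so it has no offset into the result.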
The following functions convert between NUL terminated strings in a specified encoding and NUL terminated Unicode strings.
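Function: uint8_t * u8_strconv_from_encoding (const char *string, const char *fromcode, enum iconv_ilseq_handler handler)
Converts a NUL terminated string from a given encoding.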
The result is malloc allocated, or NULL (with errno set) in case of error.
Particular errno values: EILSEQ, ENOMEM.
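Function: char * u8_strconv_to_encoding (const uint8_t *string, const char *tocode, enum iconv_ilseq_handler handler)
Converts a NUL terminated string to a given encoding.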
The result is malloc allocated, or NULL (with errno set) in case of error.
Particular errno values: EILSEQ, ENOMEM.
The following functions are shorthands that convert between NUL terminated strings in locale encoding and NUL terminated Unicode strings.
Function: uint8_t * u8_strconv_from_locale (const char *string)
Converts a NUL terminated string from the locale encoding.
The result is malloc allocated, or NULL (with errno set) in case of error.
Particular errno values: ENOMEM.
Function: char * u8_strconv_to_locale (const uint8_t *string)
Converts a NUL terminated string to the locale encoding.
The result is malloc allocated, or NULL (with errno set) in case of error.
Particular errno values: ENOMEM.