idna.h: GNU Libidn API Reference Manual

idna.h

idna.h — IDNA-related functions

Functions

const char *	idna_strerror ()
int	idna_to_ascii_4i ()
int	idna_to_unicode_44i ()
int	idna_to_ascii_4z ()
int	idna_to_ascii_8z ()
int	idna_to_ascii_lz ()
int	idna_to_unicode_4z4z ()
int	idna_to_unicode_8z4z ()
int	idna_to_unicode_8z8z ()
int	idna_to_unicode_8zlz ()
int	idna_to_unicode_lzlz ()

Types and Values

#define	IDNAPI
enum	Idna_rc
enum	Idna_flags
#define	IDNA_ACE_PREFIX

Description

IDNA-related functions.

Functions

idna_strerror ()

const char *
idna_strerror (Idna_rc rc);

Convert a return code integer to a text string. This string can be used to output a diagnostic message to the user.

IDNA_SUCCESS: Successful operation. This value is guaranteed to always be zero; the remaining ones are only guaranteed to hold non-zero values, for logical comparison purposes.
IDNA_STRINGPREP_ERROR: Error during string preparation.
IDNA_PUNYCODE_ERROR: Error during punycode operation.
IDNA_CONTAINS_NON_LDH: For IDNA_USE_STD3_ASCII_RULES, indicate that the string contains non-LDH ASCII characters.
IDNA_CONTAINS_MINUS: For IDNA_USE_STD3_ASCII_RULES, indicate that the string contains a leading or trailing hyphen-minus (U+002D).
IDNA_INVALID_LENGTH: The final output string is not within the (inclusive) range 1 to 63 characters.
IDNA_NO_ACE_PREFIX: The string does not contain the ACE prefix (for ToUnicode).
IDNA_ROUNDTRIP_VERIFY_ERROR: The ToASCII operation on output string does not equal the input.
IDNA_CONTAINS_ACE_PREFIX: The input contains the ACE prefix (for ToASCII).
IDNA_ICONV_ERROR: Character encoding conversion error.
IDNA_MALLOC_ERROR: Could not allocate buffer (this is typically a fatal error).
IDNA_DLOPEN_ERROR: Could not dlopen the libcidn DSO (only used internally in libc).

Parameters

rc	an Idna_rc return code.

Returns

Returns a pointer to a statically allocated string describing the return code rc.

idna_to_ascii_4i ()

int
idna_to_ascii_4i (const uint32_t *in,
                  size_t inlen,
                  char *out,
                  int flags);

The ToASCII operation takes a sequence of Unicode code points that make up one domain label and transforms it into a sequence of code points in the ASCII range (0..7F). If ToASCII succeeds, the original sequence and the resulting sequence are equivalent labels.

It is important to note that the ToASCII operation can fail. ToASCII fails if any step of it fails. If any step of the ToASCII operation fails on any label in a domain name, that domain name MUST NOT be used as an internationalized domain name. The method for dealing with this failure is application-specific.

The inputs to ToASCII are a sequence of code points, the AllowUnassigned flag, and the UseSTD3ASCIIRules flag. The output of ToASCII is either a sequence of ASCII code points or a failure condition.

ToASCII never alters a sequence of code points that are all in the ASCII range to begin with (although it could fail). Applying the ToASCII operation multiple times has exactly the same effect as applying it just once.

Parameters

in	input array with Unicode code points.
inlen	length of input array with Unicode code points.
out	output zero terminated string that must have room for at least 63 characters plus the terminating zero.
flags	an Idna_flags value, e.g., `IDNA_ALLOW_UNASSIGNED` or `IDNA_USE_STD3_ASCII_RULES`.

Returns

Returns 0 on success, or an Idna_rc error code.
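The two guarantees above (all-ASCII input is never altered, and out needs room for 63 characters plus the terminating zero) can be illustrated without calling the library. The following is only a sketch under stated assumptions: copy_ascii_label and LABEL_BUFFER_SIZE are hypothetical names, not part of libidn, and the function covers only the trivial all-ASCII path, not Nameprep, Punycode, or the STD3 checks that idna_to_ascii_4i() performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant: 63 characters plus the terminating zero. */
#define LABEL_BUFFER_SIZE 64

/* Hypothetical helper, not a libidn function.  Copies a label made up
   only of ASCII code points into out, unchanged.  Returns false if the
   label is empty, longer than 63 code points, or contains a code point
   above 0x7F (which would need the full ToASCII operation). */
bool
copy_ascii_label (const uint32_t *in, size_t inlen,
                  char out[LABEL_BUFFER_SIZE])
{
  size_t i;

  assert (in != NULL && out != NULL);

  if (inlen < 1 || inlen > LABEL_BUFFER_SIZE - 1)
    return false;

  for (i = 0; i < inlen; i++)
    if (in[i] > 0x7F)
      return false;

  for (i = 0; i < inlen; i++)
    out[i] = (char) in[i];
  out[inlen] = '\0';

  return true;
}
```

Sizing the caller's buffer as char out[64] matches the requirement stated for the out parameter.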

idna_to_unicode_44i ()

int
idna_to_unicode_44i (const uint32_t *in,
                     size_t inlen,
                     uint32_t *out,
                     size_t *outlen,
                     int flags);

The ToUnicode operation takes a sequence of Unicode code points that make up one domain label and returns a sequence of Unicode code points. If the input sequence is a label in ACE form, then the result is an equivalent internationalized label that is not in ACE form, otherwise the original sequence is returned unaltered.

ToUnicode never fails. If any step fails, then the original input sequence is returned immediately in that step.

The Punycode decoder can never output more code points than it inputs, but Nameprep can, and therefore ToUnicode can. Note that the number of octets needed to represent a sequence of code points depends on the particular character encoding used.

The inputs to ToUnicode are a sequence of code points, the AllowUnassigned flag, and the UseSTD3ASCIIRules flag. The output of ToUnicode is always a sequence of Unicode code points.

Parameters

in	input array with Unicode code points.
inlen	length of input array with Unicode code points.
out	output array with Unicode code points.
outlen	on input, maximum size of output array with Unicode code points; on exit, actual size of output array with Unicode code points.
flags	an Idna_flags value, e.g., `IDNA_ALLOW_UNASSIGNED` or `IDNA_USE_STD3_ASCII_RULES`.

Returns

Returns an Idna_rc error condition, which must only be used for debugging purposes. The output buffer is always guaranteed to contain the correct data according to the specification (barring malloc-induced errors). NB! This means that you should normally ignore the return code from this function, as acting on it would break the standard.
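The in/out role of outlen and the "original input is returned on failure" rule can be sketched in plain C. This is an illustration only: fallback_copy is a hypothetical helper, not a libidn function, and how the library itself reacts to an output array that is too small is not described in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not a libidn function.  Mimics the fallback
   described for ToUnicode: the input code points are handed back
   unaltered.  On entry *outlen is the capacity of out in code points;
   on success it is updated to the number of code points written.
   Returns false, leaving *outlen untouched, if out is too small. */
bool
fallback_copy (const uint32_t *in, size_t inlen,
               uint32_t *out, size_t *outlen)
{
  assert (in != NULL && out != NULL && outlen != NULL);

  if (inlen > *outlen)
    return false;

  memcpy (out, in, inlen * sizeof in[0]);
  *outlen = inlen;
  return true;
}
```

A caller would set outlen to the capacity of its array before each call, since the value is overwritten with the actual length.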

idna_to_ascii_4z ()

int
idna_to_ascii_4z (const uint32_t *input,
                  char **output,
                  int flags);

Convert UCS-4 domain name to ASCII string. The domain name may contain several labels, separated by dots. The output buffer must be deallocated by the caller.

Parameters

input	zero terminated input Unicode string.
output	pointer to newly allocated output string.
flags	an Idna_flags value, e.g., `IDNA_ALLOW_UNASSIGNED` or `IDNA_USE_STD3_ASCII_RULES`.

Returns

Returns IDNA_SUCCESS on success, or error code.
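To make "several labels, separated by dots" concrete, the sketch below counts the labels in a zero terminated UCS-4 string. It assumes the four dot characters that RFC 3490 requires to be recognized as label separators; is_label_separator and count_labels are hypothetical names, not libidn functions, and a single trailing dot is assumed not to start a further label.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the dot characters listed in RFC 3490 as label
   separators (full stop, ideographic full stop, fullwidth full stop,
   halfwidth ideographic full stop). */
bool
is_label_separator (uint32_t cp)
{
  return cp == 0x002E || cp == 0x3002 || cp == 0xFF0E || cp == 0xFF61;
}

/* Hypothetical helper: counts the labels in a zero terminated UCS-4
   domain name.  An empty string has no labels; a separator directly
   before the terminating zero does not begin a new label. */
size_t
count_labels (const uint32_t *name)
{
  const uint32_t *p;
  size_t labels;

  assert (name != NULL);

  if (name[0] == 0)
    return 0;

  labels = 1;
  for (p = name; *p != 0; p++)
    if (is_label_separator (*p) && p[1] != 0)
      labels++;

  return labels;
}
```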

idna_to_ascii_8z ()

int
idna_to_ascii_8z (const char *input,
                  char **output,
                  int flags);

Convert UTF-8 domain name to ASCII string. The domain name may contain several labels, separated by dots. The output buffer must be deallocated by the caller.

Parameters

input	zero terminated input UTF-8 string.
output	pointer to newly allocated output string.
flags	an Idna_flags value, e.g., `IDNA_ALLOW_UNASSIGNED` or `IDNA_USE_STD3_ASCII_RULES`.

Returns

Returns IDNA_SUCCESS on success, or error code.
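The calling convention here (zero terminated input, newly allocated output that the caller deallocates, 0 on success) is shared by idna_to_ascii_8z(), idna_to_ascii_lz(), idna_to_unicode_8z8z(), idna_to_unicode_8zlz() and idna_to_unicode_lzlz(). A minimal sketch of a wrapper around that shape follows. The names name_converter, convert_name and stub_identity_converter are hypothetical; the stub only stands in for a real converter so that the sketch compiles without linking libidn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Function shape shared by the converters named in the lead-in. */
typedef int (*name_converter) (const char *input, char **output, int flags);

/* Hypothetical wrapper: runs a converter and returns the newly
   allocated result, or NULL on failure.  The return code is stored in
   *rc_out; 0 corresponds to IDNA_SUCCESS.  The caller must deallocate
   a non-NULL result. */
char *
convert_name (name_converter convert, const char *input, int flags,
              int *rc_out)
{
  char *output = NULL;
  int rc;

  assert (convert != NULL && input != NULL && rc_out != NULL);

  rc = convert (input, &output, flags);
  *rc_out = rc;
  if (rc != 0)
    return NULL;

  return output;
}

/* Stand-in for a real converter (an assumption made for this sketch):
   copies the input unchanged and rejects the empty string with an
   arbitrary non-zero code. */
int
stub_identity_converter (const char *input, char **output, int flags)
{
  size_t size = strlen (input) + 1;

  (void) flags;
  if (size == 1)
    return 1;

  *output = malloc (size);
  if (*output == NULL)
    return 1;

  memcpy (*output, input, size);
  return 0;
}
```

With libidn linked, passing idna_to_ascii_8z in place of the stub should behave the same way from the caller's point of view, and the rc value could then be handed to idna_strerror() for a diagnostic message.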

idna_to_ascii_lz ()

int
idna_to_ascii_lz (const char *input,
                  char **output,
                  int flags);

Convert domain name in the locale's encoding to ASCII string. The domain name may contain several labels, separated by dots. The output buffer must be deallocated by the caller.

Parameters

input	zero terminated input string encoded in the current locale's character set.
output	pointer to newly allocated output string.
flags	an Idna_flags value, e.g., `IDNA_ALLOW_UNASSIGNED` or `IDNA_USE_STD3_ASCII_RULES`.

Returns

Returns IDNA_SUCCESS on success, or error code.

idna_to_unicode_4z4z ()

int
idna_to_unicode_4z4z (const uint32_t *input,
                      uint32_t **output,
                      int flags);

Convert possibly ACE encoded domain name in UCS-4 format into a UCS-4 string. The domain name may contain several labels, separated by dots. The output buffer must be deallocated by the caller.

Parameters

input	zero-terminated Unicode string.
output	pointer to newly allocated output Unicode string.
flags	an Idna_flags value, e.g., `IDNA_ALLOW_UNASSIGNED` or `IDNA_USE_STD3_ASCII_RULES`.

Returns

Returns IDNA_SUCCESS on success, or error code.

idna_to_unicode_8z4z ()

int
idna_to_unicode_8z4z (const char *input,
                      uint32_t **output,
                      int flags);

Convert possibly ACE encoded domain name in UTF-8 format into a UCS-4 string. The domain name may contain several labels, separated by dots. The output buffer must be deallocated by the caller.

Parameters

input	zero-terminated UTF-8 string.
output	pointer to newly allocated output Unicode string.
flags	an Idna_flags value, e.g., `IDNA_ALLOW_UNASSIGNED` or `IDNA_USE_STD3_ASCII_RULES`.

Returns

Returns IDNA_SUCCESS on success, or error code.

idna_to_unicode_8z8z ()

int
idna_to_unicode_8z8z (const char *input,
                      char **output,
                      int flags);

Convert possibly ACE encoded domain name in UTF-8 format into a UTF-8 string. The domain name may contain several labels, separated by dots. The output buffer must be deallocated by the caller.

Parameters

input	zero-terminated UTF-8 string.
output	pointer to newly allocated output UTF-8 string.
flags	an Idna_flags value, e.g., `IDNA_ALLOW_UNASSIGNED` or `IDNA_USE_STD3_ASCII_RULES`.

Returns

Returns IDNA_SUCCESS on success, or error code.

idna_to_unicode_8zlz ()

int
idna_to_unicode_8zlz (const char *input,
                      char **output,
                      int flags);

Convert possibly ACE encoded domain name in UTF-8 format into a string encoded in the current locale's character set. The domain name may contain several labels, separated by dots. The output buffer must be deallocated by the caller.

Parameters

input	zero-terminated UTF-8 string.
output	pointer to newly allocated output string encoded in the current locale's character set.
flags	an Idna_flags value, e.g., `IDNA_ALLOW_UNASSIGNED` or `IDNA_USE_STD3_ASCII_RULES`.

Returns

Returns IDNA_SUCCESS on success, or error code.

idna_to_unicode_lzlz ()

int
idna_to_unicode_lzlz (const char *input,
                      char **output,
                      int flags);

Convert possibly ACE encoded domain name in the locale's character set into a string encoded in the current locale's character set. The domain name may contain several labels, separated by dots. The output buffer must be deallocated by the caller.

Parameters

input	zero-terminated string encoded in the current locale's character set.
output	pointer to newly allocated output string encoded in the current locale's character set.
flags	an Idna_flags value, e.g., `IDNA_ALLOW_UNASSIGNED` or `IDNA_USE_STD3_ASCII_RULES`.

Returns

Returns IDNA_SUCCESS on success, or error code.

Types and Values

IDNAPI

#define             IDNAPI

Symbol holding shared library API visibility decorator.

This is used internally by the library header file and should never be used or modified by the application.

https://www.gnu.org/software/gnulib/manual/html_node/Exported-Symbols-of-Shared-Libraries.html

enum Idna_rc

Enumerated return codes of the idna_to_ascii_4i() and idna_to_unicode_44i() functions (and functions derived from those functions). The value 0 is guaranteed to always correspond to success.

Members

IDNA_SUCCESS	Successful operation. This value is guaranteed to always be zero; the remaining ones are only guaranteed to hold non-zero values, for logical comparison purposes.
IDNA_STRINGPREP_ERROR	Error during string preparation.
IDNA_PUNYCODE_ERROR	Error during punycode operation.
IDNA_CONTAINS_NON_LDH	For IDNA_USE_STD3_ASCII_RULES, indicate that the string contains non-LDH ASCII characters.
IDNA_CONTAINS_LDH	Same as `IDNA_CONTAINS_NON_LDH`, for compatibility with typo in earlier versions.
IDNA_CONTAINS_MINUS	For IDNA_USE_STD3_ASCII_RULES, indicate that the string contains a leading or trailing hyphen-minus (U+002D).
IDNA_INVALID_LENGTH	The final output string is not within the (inclusive) range 1 to 63 characters.
IDNA_NO_ACE_PREFIX	The string does not contain the ACE prefix (for ToUnicode).
IDNA_ROUNDTRIP_VERIFY_ERROR	The ToASCII operation on output string does not equal the input.
IDNA_CONTAINS_ACE_PREFIX	The input contains the ACE prefix (for ToASCII).
IDNA_ICONV_ERROR	Character encoding conversion error.
IDNA_MALLOC_ERROR	Could not allocate buffer (this is typically a fatal error).
IDNA_DLOPEN_ERROR	Could not dlopen the libcidn DSO (only used internally in libc).

enum Idna_flags

Flags to pass to idna_to_ascii_4i(), idna_to_unicode_44i() etc.

Members

IDNA_ALLOW_UNASSIGNED	Don't reject strings containing unassigned Unicode code points.
IDNA_USE_STD3_ASCII_RULES	Validate strings according to STD3 rules (i.e., normal host name rules).
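As a rough illustration of what the STD3 host name rules mean for an ASCII label, the sketch below applies the three conditions that the IDNA_CONTAINS_NON_LDH, IDNA_CONTAINS_MINUS and IDNA_INVALID_LENGTH return codes describe. It is not libidn's implementation: the enum label_check and the functions is_ldh and check_std3_label are hypothetical, and the result values are unrelated to Idna_rc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch only. */
enum label_check
{
  LABEL_OK = 0,
  LABEL_NON_LDH,       /* character other than letter, digit, hyphen */
  LABEL_HYPHEN_EDGE,   /* leading or trailing hyphen-minus */
  LABEL_BAD_LENGTH     /* not within 1..63 characters */
};

/* True for ASCII letters, digits and hyphen-minus (LDH). */
bool
is_ldh (char c)
{
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
    || (c >= '0' && c <= '9') || c == '-';
}

/* Hypothetical helper: checks one ASCII label against the rules above. */
enum label_check
check_std3_label (const char *label)
{
  size_t len, i;

  assert (label != NULL);

  len = strlen (label);
  if (len < 1 || len > 63)
    return LABEL_BAD_LENGTH;

  for (i = 0; i < len; i++)
    if (!is_ldh (label[i]))
      return LABEL_NON_LDH;

  if (label[0] == '-' || label[len - 1] == '-')
    return LABEL_HYPHEN_EDGE;

  return LABEL_OK;
}
```

For example, "my_host" would be rejected because the underscore is not an LDH character, while "my-host" passes.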

IDNA_ACE_PREFIX

#  define IDNA_ACE_PREFIX "xn--"

The IANA-allocated prefix to use for IDNA: "xn--".
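A label in ACE form can be recognized by this prefix. The sketch below shows one way to test for it, assuming a case-insensitive comparison as RFC 3490 specifies for the ACE prefix. label_has_ace_prefix is a hypothetical helper, not a libidn function, and the macro is redefined only so that the example compiles without idna.h.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value as the macro documented above; defined here only so the
   sketch compiles without <idna.h>. */
#ifndef IDNA_ACE_PREFIX
# define IDNA_ACE_PREFIX "xn--"
#endif

/* Hypothetical helper: true if the zero terminated ASCII label starts
   with the ACE prefix, ignoring case.  A label shorter than the prefix
   fails at its terminating zero, so no byte past the end is read. */
bool
label_has_ace_prefix (const char *label)
{
  const char *prefix = IDNA_ACE_PREFIX;
  size_t i;

  assert (label != NULL);

  for (i = 0; prefix[i] != '\0'; i++)
    if (tolower ((unsigned char) label[i]) != prefix[i])
      return false;

  return true;
}
```

A match only indicates that the label claims to be in ACE form; whether the remainder decodes correctly is still up to the ToUnicode operation.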