

12 Line breaking <unilbrk.h>

This include file declares functions for determining where in a string line breaks could or should be introduced, in order to make the displayed string fit into a column of given width.

These functions are locale dependent. The encoding argument identifies the encoding (e.g. "ISO-8859-2" for Polish).

The following enumerated values indicate whether, at a given position, a line break is possible or not. Given a string s as an array s[0..n-1] and a position i, the values have the following meanings:

Constant: int UC_BREAK_MANDATORY

This value indicates that s[i] is a line break character.

Constant: int UC_BREAK_CR_BEFORE_LF

This value is a variant of UC_BREAK_MANDATORY. It indicates that s[i] is a CR character and that s[i+1] is a LF character.

Constant: int UC_BREAK_POSSIBLE

This value indicates that a line break may be inserted between s[i-1] and s[i].

Constant: int UC_BREAK_HYPHENATION

This value indicates that a hyphen and a line break may be inserted between s[i-1] and s[i]. Note, however, that hyphenation rules are language dependent.

Constant: int UC_BREAK_PROHIBITED

This value indicates that s[i-1] and s[i] must not be separated.

Constant: int UC_BREAK_UNDEFINED

This value is not used as a return value; rather, in the overriding argument of the u*_width_linebreaks functions, it indicates the absence of an override.

The following functions determine the positions at which line breaks are possible.

Function: void u8_possible_linebreaks (const uint8_t *s, size_t n, const char *encoding, char *p)
Function: void u16_possible_linebreaks (const uint16_t *s, size_t n, const char *encoding, char *p)
Function: void u32_possible_linebreaks (const uint32_t *s, size_t n, const char *encoding, char *p)
Function: void ulc_possible_linebreaks (const char *s, size_t n, const char *encoding, char *p)

Determines the line break points in s, and stores the result at p[0..n-1]. Every p[i] is assigned one of the values UC_BREAK_MANDATORY, UC_BREAK_CR_BEFORE_LF, UC_BREAK_POSSIBLE, UC_BREAK_HYPHENATION, UC_BREAK_PROHIBITED.

The following functions determine where line breaks should be inserted so that each line fits in a given width, when output to a device that uses non-proportional fonts.

Function: int u8_width_linebreaks (const uint8_t *s, size_t n, int width, int start_column, int at_end_columns, const char *override, const char *encoding, char *p)
Function: int u16_width_linebreaks (const uint16_t *s, size_t n, int width, int start_column, int at_end_columns, const char *override, const char *encoding, char *p)
Function: int u32_width_linebreaks (const uint32_t *s, size_t n, int width, int start_column, int at_end_columns, const char *override, const char *encoding, char *p)
Function: int ulc_width_linebreaks (const char *s, size_t n, int width, int start_column, int at_end_columns, const char *override, const char *encoding, char *p)

Chooses the best line breaks, assuming that every character occupies a width given by the uc_width function (see Display width <uniwidth.h>).

The string is s[0..n-1].

The maximum number of columns per line is given as width. The starting column of the string is given as start_column. If the algorithm should reserve room after the last piece, the number of columns to reserve can be given as at_end_columns.

override is an optional override; if override[i] != UC_BREAK_UNDEFINED, override[i] takes precedence over p[i] as returned by the u*_possible_linebreaks function.

The given encoding is used for disambiguating widths in uc_width.

Returns the column after the end of the string, and stores the result at p[0..n-1]. Every p[i] is assigned one of the values UC_BREAK_MANDATORY, UC_BREAK_CR_BEFORE_LF, UC_BREAK_POSSIBLE, UC_BREAK_HYPHENATION, UC_BREAK_PROHIBITED. Here the value UC_BREAK_POSSIBLE indicates that a line break should be inserted.

