texi2any writes fixed strings into the output document at various places: in cross-references, page footers, the help page, alternate text for images, and so on. The string chosen depends on the value of @documentlanguage at the time the string is output (see @documentlanguage ll[_cc]: Set the Document Language, for the Texinfo command interface).
The Gettext framework is used for those strings (see Gettext). The libintl-perl package is used as the gettext implementation; more specifically, the pure Perl implementation is used, so that Texinfo behaves consistently across all platforms and installations, which would not otherwise be possible. libintl-perl is included in the Texinfo distribution and always installed, to ensure that it is available if needed. It is also possible to use the system gettext (the choice can be made at build time).
The Gettext domain ‘texinfo_document’ is used for the strings. Translated strings are written as Texinfo, and may include @-commands. In translated strings, the varying parts of the string are usually denoted not by %s and the like, but by ‘{arg_name}’. (This convention is common for gettext in Perl and is fully supported in GNU Gettext; see Perl Format Strings in GNU Gettext.) For example, in the following, ‘{section}’ will be replaced by the section name:

    see {section}
These Perl-style brace format strings are used for two reasons: first, reordering printf arguments has only been possible since Perl 5.8.0; second, and more importantly, the order of arguments is unpredictable, since @-command expansion may lead to different orders depending on the output format.
The expansion of a translation string proceeds as follows. First, the string is translated; the locale used is documentlanguage.documentencoding. If the documentlanguage has the form ‘ll_CC’, that is tried first, and then just ‘ll’. To cope with the possibility of having multiple encodings, a special use of the us-ascii locale encoding is also possible: if the ‘ll’ locale in the current encoding does not exist, and the encoding is not us-ascii, then us-ascii is tried.
The idea is that if there is a us-ascii encoding, it means that all the characters in the charset may be expressed as @-commands. For example, there is a fr.us-ascii locale that can accommodate any encoding, since all the Latin 1 characters have associated @-commands. On the other hand, Japanese has only a translation ja.utf-8, since there are no @-commands for Japanese characters.
The us-ascii locales are not needed much now that UTF-8 is used for most documents. Note that accented characters are required to be expressed as @-commands in the us-ascii locales, which may be inconvenient for translators.
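One plausible reading of the lookup order just described can be sketched as follows (a Python illustration under the stated rules, not the libintl-perl implementation; `locale_candidates` is a made-up name, and whether the ‘ll_CC.us-ascii’ variant is actually consulted is an assumption):

```python
def locale_candidates(documentlanguage, documentencoding):
    # Hypothetical helper sketching the lookup order: 'll_CC' is tried
    # before plain 'll', and us-ascii serves as a fallback encoding, on
    # the idea that a us-ascii translation writes non-ASCII characters
    # as @-commands.
    languages = [documentlanguage]
    if '_' in documentlanguage:
        languages.append(documentlanguage.split('_')[0])
    encodings = [documentencoding]
    if documentencoding != 'us-ascii':
        encodings.append('us-ascii')
    return ['%s.%s' % (lang, enc)
            for enc in encodings for lang in languages]

print(locale_candidates('fr_FR', 'iso-8859-1'))
# ['fr_FR.iso-8859-1', 'fr.iso-8859-1', 'fr_FR.us-ascii', 'fr.us-ascii']
print(locale_candidates('ja', 'utf-8'))
# ['ja.utf-8', 'ja.us-ascii']
```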
In the following example, ‘{date}’, ‘{program_homepage}’ and ‘{program}’ are the arguments of the string. Since they are used in @uref, their order is not predictable. ‘{date}’, ‘{program_homepage}’ and ‘{program}’ are substituted after the expansion:

    Generated on @emph{{date}} using @uref{{program_homepage}, @emph{{program}}}.
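The ordering matters: the translated Texinfo string is expanded first, and the brace arguments are substituted last. Here is a rough Python sketch of that pipeline, with a toy `expand_texinfo` standing in for the real converter (which of course handles far more than these two commands) and illustrative argument values:

```python
import re

# One level of brace nesting, enough to step over '{arg}' placeholders
# that sit inside an @-command's own braces.
NESTED = r'(?:\{[^{}]*\}|[^{}])'

def expand_texinfo(texinfo):
    # Toy HTML expansion of just the two @-commands in the example;
    # a stand-in for the real converter, not texi2any's actual code.
    texinfo = re.sub(r'@emph\{(%s*?)\}' % NESTED,
                     r'<em>\1</em>', texinfo)
    texinfo = re.sub(r'@uref\{(%s*?), (%s*?)\}' % (NESTED, NESTED),
                     r'<a href="\1">\2</a>', texinfo)
    return texinfo

translated = ('Generated on @emph{{date}} using '
              '@uref{{program_homepage}, @emph{{program}}}.')
expanded = expand_texinfo(translated)        # expansion first...
arguments = {'date': '5 May 2025',           # illustrative values
             'program_homepage': 'https://www.gnu.org/software/texinfo/',
             'program': 'texi2any'}
result = re.sub(r'\{(\w+)\}',                # ...substitution last
                lambda m: arguments[m.group(1)], expanded)
print(result)
```

Because the placeholders survive @-command expansion intact, the output format is free to move them around before they are filled in.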
This approach is admittedly a bit complicated. Its usefulness is that it supports having translations available in different encodings, for any encoding that can be covered by @-commands, and also allows the formatting of some commands to be specified independently of the output format, while still being language-dependent. For example, the ‘@pxref’ translation string can be like this:

    see {node_file_href} section `{section}' in @cite{{book}}

which specifies a string independently of the output format, while its rich formatting still lets it be translated appropriately into many languages.