Hardcoded string concatenation is sometimes used to construct English strings:
strcpy (s, "Replace ");
strcat (s, object1);
strcat (s, " with ");
strcat (s, object2);
strcat (s, "?");
In order to present to the translator only entire sentences, and also because in some languages the translator might want to swap the order of object1 and object2, it is necessary to change this to use a format string:
sprintf (s, "Replace %s with %s?", object1, object2);
In many programming languages, a particular operator denotes string concatenation at runtime (or possibly at compile time, if the compiler supports that).
For example, string concatenation of std::string objects in C++, and of strings in Java and C#, is denoted by the ‘+’ operator.
So, for example, in Java, you would change
System.out.println("Replace "+object1+" with "+object2+"?");
into a statement involving a format string:
System.out.println( MessageFormat.format("Replace {0} with {1}?", new Object[] { object1, object2 }));
Similarly, in C#, you would change
Console.WriteLine("Replace "+object1+" with "+object2+"?");
into a statement involving a format string:
Console.WriteLine( String.Format("Replace {0} with {1}?", object1, object2));
In some programming languages, it is possible to have strings with embedded expressions. The expressions can refer to variables of the program. The value of such an expression is converted to a string and inserted in place of the expression; but no formatting function is called.
Examples of this syntax:
"Hello, $name!" or "Hello, ${name}!" (shell; Perl and PHP accept the same forms)
f"Hello, {name}!" (Python)
$"Hello, {name}!" (C#)
"Hello, #{name}!" (Ruby)
`Hello, ${name}!` (JavaScript)
These cases are effectively string concatenation as well, just with a different syntax.
So, for example, in Python, you would change
print (f'Replace {object1.name} with {object2.name}?')
into a statement involving a format string:
print ('Replace %(name1)s with %(name2)s?' % { 'name1': object1.name, 'name2': object2.name })
or equivalently
print ('Replace {name1} with {name2}?' .format(name1 = object1.name, name2 = object2.name))
And in JavaScript, you would change
print (`Replace ${object1.name} with ${object2.name}?`)
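into a statement involving a format string (standard JavaScript strings have no built-in format method; this assumes an environment that provides one, such as GNOME's gjs):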
print ('Replace %s with %s?'.format(object1.name, object2.name))
Specifically in JavaScript, an alternative is to use a tagged template literal:
print (tag`Replace ${object1.name} with ${object2.name}?`)
and pass an option ‘--tag=tag:format’ to xgettext.
Format strings with embedded named references are different: They are suitable for internationalization, because it is possible to insert a call to the gettext function (that will return a translated format string) before the argument values are inserted in place of the placeholders.
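The difference in timing can be sketched in Python. As an assumption for illustration only, CATALOG and _ below are hypothetical stand-ins for a message catalog and the gettext function, and the French entry is an invented example.

```python
# Hypothetical stand-in for a message catalog and for the gettext function.
CATALOG = {"Hello, {name}!": "Bonjour, {name} !"}


def _(msgid):
    """Return the catalog entry for msgid, or msgid itself if there is none."""
    return CATALOG.get(msgid, msgid)


def greet_with_embedded_expression(name):
    # The value is inserted first, so the lookup key is e.g. "Hello, Alice!",
    # which no catalog can contain for every possible name.
    return _(f"Hello, {name}!")


def greet_with_format_string(name):
    # The constant format string is looked up first; the value goes in last.
    return _("Hello, {name}!").format(name=name)


print(greet_with_embedded_expression("Alice"))
print(greet_with_format_string("Alice"))
```

Only the second function ever presents the constant string "Hello, {name}!" to the lookup, which is what makes a translation possible.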
The format string types that allow embedded named references include the two Python forms shown above, ‘%(name1)s’ and ‘{name1}’.
<inttypes.h> macros

A similar case is compile time concatenation of strings. The ISO C 99 include file <inttypes.h> contains a macro PRId64 that can be used as a formatting directive for outputting an ‘int64_t’ integer through printf. It expands to a constant string, usually "d" or "ld" or "lld" or something like this, depending on the platform.
Assume you have code like
printf ("The amount is %0" PRId64 "\n", number);
The gettext tools and library have special support for these <inttypes.h> macros. You can therefore simply write
printf (gettext ("The amount is %0" PRId64 "\n"), number);
The PO file will contain the string "The amount is %0<PRId64>\n". The translators will provide a translation containing "%0<PRId64>" as well, and at runtime the gettext function’s result will contain the appropriate constant string, "d" or "ld" or "lld".
This works only for the predefined <inttypes.h> macros. If you have defined your own similar macros, let’s say ‘MYPRId64’, that are not known to xgettext, the solution for this problem is to change the code like this:
char buf1[100];
sprintf (buf1, "%0" MYPRId64, number);
printf (gettext ("The amount is %s\n"), buf1);
This means that you put the platform dependent code in one statement, and the internationalization code in a different statement. Note that a buffer length of 100 is safe, because all available hardware integer types are limited to 128 bits, and to print a 128 bit integer one needs at most 54 characters, regardless of whether in decimal, octal or hexadecimal.
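The buffer-size argument can be checked with a short computation. The sketch below uses Python's arbitrary-precision integers purely as a calculator; the helper name printed_width is an assumption made for this example, and the counts exclude the terminating NUL byte that a C buffer also needs.

```python
def printed_width(value, spec):
    """Number of characters needed to print value in the given base."""
    return len(format(value, spec))


U128_MAX = 2**128 - 1   # largest unsigned 128-bit value
I128_MIN = -(2**127)    # most negative signed 128-bit value (includes a sign)

widths = {
    "decimal, unsigned maximum": printed_width(U128_MAX, "d"),
    "decimal, signed minimum": printed_width(I128_MIN, "d"),
    "octal": printed_width(U128_MAX, "o"),
    "hexadecimal": printed_width(U128_MAX, "x"),
}

for label, width in widths.items():
    print(f"{label}: {width} characters")
```

The widest case is octal, and every case stays well within both the bound stated above and the 100-byte buffer.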