

4.3.4 No string concatenation

Hardcoded string concatenation is sometimes used to construct English strings:

strcpy (s, "Replace ");
strcat (s, object1);
strcat (s, " with ");
strcat (s, object2);
strcat (s, "?");

To present only entire sentences to the translator, and because in some languages the translator may need to swap the order of object1 and object2, change this code to use a format string:

sprintf (s, "Replace %s with %s?", object1, object2);
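The benefit of a single format string can be sketched in a few lines of Python (used here only for illustration; the code above is C). The CATALOG dictionary, the translate helper, and the reordered pseudo-translation are all hypothetical stand-ins for a real message catalog and gettext. The point is only that a translator who receives the whole sentence can move the placeholders, which concatenated fragments do not allow.

```python
# Hypothetical catalog: maps the English format string to a "translation"
# whose word order requires the two objects to be swapped.
CATALOG = {
    "Replace {0} with {1}?": "Use {1} in place of {0}?",
}

def translate(msgid):
    """Stand-in for gettext: return the catalog entry, or msgid if none exists."""
    return CATALOG.get(msgid, msgid)

def replace_prompt(object1, object2):
    # Look up the whole sentence first, then substitute the arguments.
    return translate("Replace {0} with {1}?").format(object1, object2)

print(replace_prompt("foo.txt", "bar.txt"))
```

With the concatenated version, the translator would only ever see "Replace ", " with " and "?" as separate fragments, and the order of the two objects would be fixed by the program.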

String concatenation operator

In many programming languages, a particular operator denotes string concatenation at runtime (or possibly at compile time, if the compiler supports that).

So, for example, in Java, you would change

System.out.println("Replace "+object1+" with "+object2+"?");

into a statement involving a format string:

System.out.println(
    MessageFormat.format("Replace {0} with {1}?",
                         new Object[] { object1, object2 }));

Similarly, in C#, you would change

Console.WriteLine("Replace "+object1+" with "+object2+"?");

into a statement involving a format string:

Console.WriteLine(
    String.Format("Replace {0} with {1}?", object1, object2));

Strings with embedded expressions

In some programming languages, it is possible to have strings with embedded expressions. The expressions can refer to variables of the program. The value of such an expression is converted to a string and inserted in place of the expression; but no formatting function is called.

These cases are effectively string concatenation as well, just with a different syntax.

So, for example, in Python, you would change

print (f'Replace {object1.name} with {object2.name}?')

into a statement involving a format string:

print ('Replace %(name1)s with %(name2)s?'
       % { 'name1': object1.name, 'name2': object2.name })

or equivalently

print ('Replace {name1} with {name2}?'
       .format(name1 = object1.name, name2 = object2.name))

And in JavaScript, you would change

print (`Replace ${object1.name} with ${object2.name}?`)

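into a statement involving a format string (this assumes an environment that provides a format method on strings, since the core JavaScript language does not define one):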

print ('Replace %s with %s?'.format(object1.name, object2.name))

Specifically in JavaScript, an alternative is to use a tagged template literal:

print (tag`Replace ${object1.name} with ${object2.name}?`)

and pass an option ‘--tag=tag:format’ to xgettext.

Format strings with embedded named references

Format strings with embedded named references are different: they are suitable for internationalization, because a call to the gettext function (which returns a translated format string) can be inserted before the argument values are substituted for the placeholders.
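A minimal Python sketch of that ordering follows. It uses the standard gettext.NullTranslations class as a base; the DictTranslations subclass and its one-entry catalog are hypothetical, standing in for a compiled message catalog. The lookup succeeds only when gettext sees the format string before the values are substituted.

```python
import gettext

class DictTranslations(gettext.NullTranslations):
    """Hypothetical catalog backed by a dict instead of a .mo file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        # Return the catalog entry, or the untranslated message if none exists.
        return self._messages.get(message, message)

# Illustrative pseudo-translation that swaps the two names.
_ = DictTranslations({
    "Replace {name1} with {name2}?": "Use {name2} in place of {name1}?",
}).gettext

name1, name2 = "foo.txt", "bar.txt"

# Embedded expressions: the values are inserted before the lookup,
# so the catalog is asked for a string it cannot contain.
too_late = _(f"Replace {name1} with {name2}?")

# Named references: the lookup happens first, substitution afterwards.
translated = _("Replace {name1} with {name2}?").format(name1=name1, name2=name2)

print(too_late)
print(translated)
```

The first call falls back to the untranslated English text, while the second returns the catalog entry with the names filled in.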

The format string types that allow embedded named references are:

The <inttypes.h> macros

A similar case is compile-time concatenation of strings. The ISO C99 header file <inttypes.h> contains a macro PRId64 that can be used as a formatting directive for outputting an ‘int64_t’ integer through printf. It expands to a constant string, usually "d", "ld", "lld" or something similar, depending on the platform. Assume you have code like

printf ("The amount is %0" PRId64 "\n", number);

The gettext tools and library have special support for these <inttypes.h> macros. You can therefore simply write

printf (gettext ("The amount is %0" PRId64 "\n"), number);

The PO file will contain the string "The amount is %0<PRId64>\n". The translators will provide a translation containing "%0<PRId64>" as well, and at runtime the gettext function’s result will contain the appropriate constant string, "d" or "ld" or "lld".

This works only for the predefined <inttypes.h> macros. If you have defined your own similar macros, let’s say ‘MYPRId64’, that are not known to xgettext, the solution for this problem is to change the code like this:

char buf1[100];
sprintf (buf1, "%0" MYPRId64, number);
printf (gettext ("The amount is %s\n"), buf1);

This means that you put the platform-dependent code in one statement and the internationalization code in a different statement. Note that a buffer length of 100 is safe, because all available hardware integer types are limited to 128 bits, and printing a 128-bit integer needs at most 54 characters, regardless of whether it is printed in decimal, octal or hexadecimal.
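The size claim can be checked with a little arithmetic. The Python sketch below counts the digits of the largest unsigned 128-bit value in each base; the counts exclude any sign, base prefix, or terminating NUL, which is why they come out below the bound quoted above.

```python
# Largest value representable in an unsigned 128-bit integer.
MAX_U128 = 2**128 - 1

# Number of digits needed to print that value in each base.
digits = {
    "decimal": len(format(MAX_U128, "d")),
    "octal": len(format(MAX_U128, "o")),
    "hexadecimal": len(format(MAX_U128, "x")),
}

for base, count in digits.items():
    print(f"{base}: {count} digits")
```

Octal is the widest of the three, and it still leaves ample room in a 100-byte buffer.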

