Gettext String IDs Considered Harmful
=====================================

Documentation for gettext at .

Examples of automated format and spell checks and tracking of translations
can be found here:

 *
 *
 *
 *

Listed below are the main disadvantages of using logical strings as
message IDs. The list may not be exhaustive.

Compiler checks
~~~~~~~~~~~~~~~

With normal strings, a wrongly specified format string or a missing
argument produces a compiler warning. With logical strings the compiler
cannot see the real format, so the same mistake produces bogus warnings at
build time, and bogus data or even a crash at run time.

Example:

 * Logical: Bogus warnings + Runtime crash

     printf(_("foo_bar"), 450000000, 'c');

     msgid "foo_bar"
     msgstr "value %ld with string %s and %s.\n"

 * English: Proper build time warnings

     printf(_("value %ld with string %s and %s.\n"), 450000, 'c');

     $ gcc -Wall -o foo foo.c
     foo.c:6: warning: format '%ld' expects type 'long int', but argument 2 has type 'int'
     foo.c:6: warning: format '%s' expects type 'char *', but argument 3 has type 'int'
     foo.c:6: warning: too few arguments for format

Translation consistency checks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some checks can be performed automatically by the po tools when normal
strings are used.

Examples:

 * Logical: No warnings nor errors

     msgid "foo_bar"
     msgstr "this is an integer %d.\n"

     msgid "foo_bar"
     msgstr "això és un enter %f %s."

 * English: Fatal errors

     #, c-format
     msgid "this is an integer %d.\n"
     msgstr "això és un enter %f %s."

     $ msgfmt -c foo.po
     foo.po:14: number of format specifications in 'msgid' and 'msgstr' does not match
     foo.po:17: `msgid' and `msgstr' entries do not both end with '\n'
     msgfmt: found 2 fatal errors

Fuzzy handling
~~~~~~~~~~~~~~

With logical strings, changes to the original English string do not
trigger changes to the translations. With normal strings, the affected
translations are marked "fuzzy", and fuzzy translations are not displayed
to the user, since they may no longer match the meaning of the current
original text.

Example:

 * Logical:

     printf(_("foo_bar"), 4, "baz");

     msgid "foo_bar"
     msgstr "This is a string"

   - We fix a typo.

     msgid "foo_bar"
     msgstr "This is a string."

   - The translations are not flagged. Manual review is needed.

     msgid "foo_bar"
     msgstr "Això és una cadena de text"

 * English:

     printf(_("This is a string"), 4, "baz");

     msgid "This is a string"
     msgstr "Això és una cadena de text"

   - We fix a typo. The translation gets "fuzzy".

     #, fuzzy
     msgid "This is a string."
     msgstr "Això és una cadena de text"

   - We fix the translations. No "fuzzy" anymore.

     msgid "This is a string."
     msgstr "Això és una cadena de text."

Repeated translations of the same strings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With normal strings there is no need to translate the same string more
than once. Moreover, if all translations in the project are collected in
a single big po file, repeated translations can be avoided across
programs, and not just across different strings in the same program.

Example:

 * Logical: Repeated

     printf(_("foo_bar"));
     ...
     printf(_("bar_quux"));

     msgid "foo_bar"
     msgstr "This is a string."

     msgid "bar_quux"
     msgstr "This is a string."

 * English: Merged

     printf(_("This is a string."));
     ...
     printf(_("This is a string."));

     msgid "This is a string."
     msgstr "Això és una cadena de text."

Incremental translations
~~~~~~~~~~~~~~~~~~~~~~~~

When a string has no translation yet, gettext falls back to the msgid.
With normal strings that means falling back to English. The only problem
here is for users who don't understand English, but even then it is more
user friendly to display English than some nonsense, and the user can
always use a dictionary.

Example:

 * Logical: Printing nonsense

     msgid "foo_bar"
     msgstr ""

     output: foo_bar

 * English: Printing something meaningful

     msgid "this is a string.\n"
     msgstr ""

     output: this is a string.

Meaningful output without locale
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, if the user has not chosen a locale (something that should
not happen on our device), the messages printed will still be meaningful.

Example:

 * Logical: Printing nonsense

     msgid "foo_bar"
     msgstr "this is a string.\n"

     output: foo_bar

 * English: Printing something meaningful

     msgid "this is a string.\n"
     msgstr "això és una cadena de text.\n"

     output: this is a string.
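
The fallback behaviour described in the last two sections can be checked
with a few lines of C. This is only a minimal sketch, assuming glibc, where
gettext() is declared in <libintl.h> and needs no extra library. The helper
name is_untranslated and the FALLBACK_DEMO_MAIN macro are made up for this
illustration. The program deliberately never calls setlocale(),
bindtextdomain() or textdomain(), so it stays in the "C" locale with no
catalog loaded, and gettext() hands back the msgid it was given.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 when gettext() found no translation and
 * handed back the msgid unchanged, 0 otherwise. */
int is_untranslated(const char *msgid)
{
    return strcmp(gettext(msgid), msgid) == 0;
}

#ifdef FALLBACK_DEMO_MAIN
int main(void)
{
    /* No locale selected and no catalog bound: every lookup falls back. */
    assert(is_untranslated("foo_bar"));
    assert(is_untranslated("this is a string.\n"));

    /* English msgid: the fallback text is readable as it is. */
    fputs(gettext("this is a string.\n"), stdout);

    /* Logical ID: the fallback is the raw identifier. */
    puts(gettext("foo_bar"));

    return 0;
}
#endif
```

Built with something like "gcc -Wall -DFALLBACK_DEMO_MAIN -o fallback
fallback.c" (the file name is arbitrary), the program should print the
English sentence for the first lookup and the bare identifier foo_bar for
the second, which is the difference the two sections above describe.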