Internationalization (commonly abbreviated i18n) covers many areas: more than
just translating UI strings, it involves changing settings and defaults to
match the customs and conventions of the locale a program is being run in,
such as days of the week, human name formats and currencies.
- Design projects to be internationalized from the beginning.
- Use gettext (not intltool) for string translation.
- Remember that all strings are in UTF-8, and may contain multi-byte characters.
- Programs cannot reasonably implement changing locales at runtime.
Documenting the whole process of preparing a project for internationalization is beyond the scope of this document, but some good guides exist:
- GNOME developer translation guidelines
- gtkmm translation guidelines (aimed at C++ programmers, but widely applicable to C programmers)
- GLib internationalization API reference
It is important to prepare a project for internationalization early in its lifetime, otherwise non-internationalizable programming practices creep in and are hard to eliminate. One example is splitting a single sentence into several separately translated strings.
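As an illustration of the split-string problem, the sketch below contrasts a message assembled from separately translated fragments with a single translatable format string. It assumes the conventional `_()` shorthand for `gettext()`; the function names and the message text are hypothetical.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>

/* Conventional shorthand for marking and translating a string. */
#define _(String) gettext (String)

/* Bad: the sentence is split into fragments which are translated
 * separately. Translators see "Copying " and " to " with no context, and
 * cannot reorder the file names to suit the grammar of their language. */
int
format_status_split (char *buf, size_t size, const char *src, const char *dest)
{
  assert (buf != NULL && src != NULL && dest != NULL);
  return snprintf (buf, size, "%s%s%s%s", _("Copying "), src, _(" to "), dest);
}

/* Better: one complete sentence in one translatable string. The translated
 * format string can place the two file names wherever the language needs. */
int
format_status_whole (char *buf, size_t size, const char *src, const char *dest)
{
  assert (buf != NULL && src != NULL && dest != NULL);
  return snprintf (buf, size, _("Copying %s to %s"), src, dest);
}
```

With no translation catalogue installed, both functions produce the same English text; the difference only shows up once translators start working on the strings.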
To add internationalization support to a project, follow the gettext
documentation, which can be summarised as adding the following to configure.ac:
    AM_GNU_GETTEXT_VERSION([0.19])
    AM_GNU_GETTEXT([external])

    GETTEXT_PACKAGE=AC_PACKAGE_TARNAME
    AC_DEFINE_UNQUOTED(GETTEXT_PACKAGE, ["$GETTEXT_PACKAGE"], [Define to the Gettext package name])
    AC_SUBST(GETTEXT_PACKAGE)
Note that intltool is outdated, and we only need to use gettext.
Also add po to SUBDIRS in the top-level Makefile.am. Then create an empty
po/POTFILES.in file (which will be modified when files are marked for
translation), an empty po/LINGUAS file (which will be modified when extra
translation languages are added), and create po/Makevars containing:
    DOMAIN = $(PACKAGE)-$(VERSION)
    COPYRIGHT_HOLDER =
    MSGID_BUGS_ADDRESS =
    EXTRA_LOCALE_CATEGORIES =
    PO_DEPENDS_ON_POT = no

    XGETTEXT_OPTIONS = \
        --from-code=UTF-8 \
        --keyword=_ --flag=_:1:pass-c-format \
        --keyword=N_ --flag=N_:1:pass-c-format \
        --flag=g_log:3:c-format --flag=g_logv:3:c-format \
        --flag=g_error:1:c-format --flag=g_message:1:c-format \
        --flag=g_critical:1:c-format --flag=g_warning:1:c-format \
        --flag=g_print:1:c-format \
        --flag=g_printerr:1:c-format \
        --flag=g_strdup_printf:1:c-format --flag=g_strdup_vprintf:1:c-format \
        --flag=g_printf_string_upper_bound:1:c-format \
        --flag=g_snprintf:3:c-format --flag=g_vsnprintf:3:c-format \
        --flag=g_string_sprintf:2:c-format \
        --flag=g_string_sprintfa:2:c-format \
        --flag=g_scanner_error:2:c-format \
        --flag=g_scanner_warn:2:c-format

    subdir = po
    top_builddir = ..
These should be committed to git.
No other translation infrastructure files should be committed to git, especially not the following. See the module setup guidelines for more information.
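Once the build system defines GETTEXT_PACKAGE, the program still has to initialise gettext when it starts. The following is a minimal sketch of that step, not a definitive implementation. It assumes GETTEXT_PACKAGE comes from config.h and that a LOCALEDIR macro is passed in by the build system; the fallback values and the `init_i18n()` name are hypothetical and only keep the example self-contained.

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Normally provided by the build system: GETTEXT_PACKAGE via config.h and
 * LOCALEDIR via a compiler flag. These fallbacks are hypothetical. */
#ifndef GETTEXT_PACKAGE
#define GETTEXT_PACKAGE "example-package"
#endif
#ifndef LOCALEDIR
#define LOCALEDIR "/usr/share/locale"
#endif

/* Call once, early in main(), before any translated string is used and
 * before any threads are started. Returns 0 on success, -1 on failure. */
int
init_i18n (void)
{
  /* Adopt the locale from the user's environment. If it is unsupported,
   * the program simply stays in the "C" locale, which is not fatal. */
  setlocale (LC_ALL, "");

  /* Tell gettext where this package's message catalogues are installed. */
  if (bindtextdomain (GETTEXT_PACKAGE, LOCALEDIR) == NULL)
    return -1;

  /* Always return translations encoded as UTF-8. */
  if (bind_textdomain_codeset (GETTEXT_PACKAGE, "UTF-8") == NULL)
    return -1;

  /* Make this package's domain the default for gettext() calls. */
  if (textdomain (GETTEXT_PACKAGE) == NULL)
    return -1;

  return 0;
}
```

Calling `init_i18n()` as the first statement of `main()` keeps the single `setlocale()` call ahead of any thread creation.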
All strings in GLib, unless otherwise specified, are in Unicode, encoded as UTF-8. They must be handled as such, which means all string manipulation must be done in terms of Unicode characters, rather than bytes. In many cases, string manipulation functions do not need to differentiate between the two; manual array indexing is a situation where you should be careful.
GLib provides a set of UTF-8-safe versions of standard C string manipulation functions, which should always be used instead of the standard C ones.
When displaying sorted strings in the UI, take care to sort them using Unicode
collation algorithms, rather than plain ASCII comparison. This means using a
locale-aware function such as g_utf8_collate(), rather than strcmp(), to
establish an order between two strings.
Furthermore, if section headings are needed to split a list into alphabetical
sections, they must be generated using the current locale’s alphabet, rather
than just the English A–Z alphabet. One approach would be to extract the first
character of each item and use it as a section heading if it is considered
alphabetic for the current locale.
Changing locale at runtime is not safe, as it requires calling setlocale(),
which is explicitly not thread safe. It also involves more than just changing
UI strings: date formats, number formats, and the output of any code which is
predicated on those all change too. The impacts of changing locale can be
far-reaching and subtle.
To change the locale of an application, the application has to be restarted.
When referring to languages (e.g. in configuration files or preferences), always use the ISO 639 language codes, as used by gettext.
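As a small illustration, the language code can be recovered from a POSIX-style locale name, which has the form language[_territory][.codeset][@modifier]. The helper below is a hypothetical sketch using only standard C; it does not validate that the result is a registered ISO 639 code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the language part of a locale name such as "pt_BR.UTF-8" into `out`.
 * Returns 0 on success, or -1 if the language part is empty or does not fit
 * in `out_size` bytes (including the terminating nul). */
int
language_code_from_locale (const char *locale, char *out, size_t out_size)
{
  assert (locale != NULL && out != NULL);

  /* The language code ends at the first '_', '.' or '@', or at the end of
   * the string if none of those separators is present. */
  size_t len = strcspn (locale, "_.@");

  if (len == 0 || len >= out_size)
    return -1;

  memcpy (out, locale, len);
  out[len] = '\0';
  return 0;
}
```

A preference file would then store a code such as "pt" or "de" rather than a display name like "Portuguese". Special locale names such as "C" and "POSIX" are not language codes and would need separate handling.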