Managing Static Translations

The Basics

An aggressive summary of the Django translation documentation

Essentially, we change every place where our code contains text displayed to the user (e.g. Hello, USER) so that it asks a translation system for the equivalent in the current language. Django uses the very common GNU gettext system.

1. Change our source code to ask gettext for each string.

   In Python (gettext is named ugettext on older, Python 2-era Django versions):

                from django.utils.translation import gettext as _
                print(_("Welcome to our site"))

   In templates:

                {% load i18n %}
                <h1>{% trans "Welcome to our site" %}</h1>

2. Run django-admin makemessages -a (the command is django-admin.py on older Django versions)
3. Translate the files that makemessages created in your project's locale directory
4. Run django-admin compilemessages
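
Outside Django, the same lookup can be sketched with Python's standard gettext module, which Django's translation layer builds on. This is only an illustration under assumptions: the empty temporary directory stands in for a project's locale directory, and the pt language code is arbitrary. With no compiled catalog available, the lookup simply hands back the original string.

```python
import gettext
import tempfile

# Stand-in for a project's locale directory; it is empty, so no catalog exists.
locale_dir = tempfile.mkdtemp()

# fallback=True returns a NullTranslations object instead of raising
# an error when no compiled .mo file is found for the language.
catalog = gettext.translation(
    "django", localedir=locale_dir, languages=["pt"], fallback=True
)
_ = catalog.gettext

# No Portuguese catalog was found, so the original text comes back unchanged.
message = _("Welcome to our site")
print(message)
```

Once compilemessages has produced a .mo file under locale/pt/LC_MESSAGES/, the same call would return the translated text instead.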

The Not-So-Basics

Templates: trans vs. blocktrans

The trans template tag is shorter but limited to simple string values or entire variables. If you need to include variables inside a string or process your input in any way, use blocktrans. Since the format strings are part of the text sent to translators, obvious variable names help avoid confusion.


                {% blocktrans with username=request.user.username %}You're logged in as {{ username }}{% endblocktrans %}


                {% blocktrans with modification_date=object.updated|date %}Last updated: {{ modification_date }}{% endblocktrans %}

Pluralization

Proper usage of the language often requires the translator to specify multiple translations based on the number of items being referred to. This is even useful in English for avoiding sloppy forms like follower(s):


                {% blocktrans count followers=user.followers.count %}
                    You have only one follower.
                {% plural %}
                    You have {{ followers }} followers.
                {% endblocktrans %}

Gettext lets translators supply as many plural forms as a particular language's rules require, not just one or two. The syntax is unfriendly, but it allows you to handle most languages correctly.
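
To make the plural-forms idea concrete, here is a rough Python rendering of the rule commonly used for Polish, which has three forms. In a real catalog the rule lives in the .po file's Plural-Forms header as a C-style expression; the function name plural_index is made up for this sketch, and the header in the comment is the widely used Polish one rather than anything taken from this project.

```python
# Polish Plural-Forms header as commonly shipped in .po files:
#   nplurals=3;
#   plural=(n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);

def plural_index(n):
    """Return which of the three msgstr[] entries applies for n (hypothetical helper)."""
    if n == 1:
        return 0  # singular
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1  # "few" form: 2-4, 22-24, ... but not 12-14
    return 2      # "many" form: everything else


for n in (1, 2, 5, 12, 22):
    print(n, plural_index(n))
```

The translator fills in one msgstr[] entry per index, and gettext picks the right one from the count passed to blocktrans.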

Context

A translator may not be able to tell what a word means without additional context (is June a month or a name? Is object a complaint or a programming term?). Django exposes gettext's underlying support for message contexts:


                {% trans "Print" context "Type of artwork" %}


                {% trans "Print" context "Word processor command" %}
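
Why context matters can be sketched with a toy catalog in Python. This is not Django's implementation: the dictionary, the Spanish strings and the local pgettext function are illustrative assumptions. The one borrowed detail is that gettext catalogs key a contextual message by joining the context and the message with the "\x04" character, and that Django's Python-side pgettext takes its arguments in the same (context, message) order.

```python
# gettext joins msgctxt and msgid with this character when building catalog keys.
CONTEXT_SEPARATOR = "\x04"

# Toy catalog with illustrative Spanish translations (assumed for this sketch).
catalog = {
    "Type of artwork" + CONTEXT_SEPARATOR + "Print": "Grabado",
    "Word processor command" + CONTEXT_SEPARATOR + "Print": "Imprimir",
}


def pgettext(context, message):
    """Look up message within context, falling back to the original text."""
    return catalog.get(context + CONTEXT_SEPARATOR + message, message)


print(pgettext("Type of artwork", "Print"))
print(pgettext("Word processor command", "Print"))
```

Without the context, both uses of "Print" would share a single msgid and the translator could supply only one translation for both.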

Format Localization

Django's format localization support allows you to adjust how dates and numbers are displayed. Setting USE_L10N = True in your settings activates this globally, but sometimes localization isn't appropriate: perhaps you need to format a number for another program and want to tell Django not to use a thousands separator. You can switch localization off for a whole block with the localize tag or for a single value with the unlocalize filter:


                {% load l10n %}


                {% localize off %}

                    <div data-id="{{ obj.pk }}">

                {% endlocalize %}


                <div data-id="{{ obj.pk|unlocalize }}">
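
The problem that localize off avoids can be reproduced in plain Python. This sketch uses Python's own format specifier rather than Django's localization machinery, and the value is arbitrary: once a thousands separator is inserted for display, the result is no longer something another program can parse as an integer.

```python
value = 1234567

display = format(value, ",")  # grouped for human readers
machine = str(value)          # ungrouped, safe for attributes and APIs

print(display)
print(machine)

# The grouped form cannot be read back as a number.
try:
    int(display)
    parses = True
except ValueError:
    parses = False

print(parses)
```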

In other cases, you might want to override Django's default format rules. In one example, we needed to change the way full month names were formatted in Portuguese from 20 de Janeiro de 2012 to 20 de janeiro de 2012.

Django's date formatting codes include both F (long month name, e.g. Janeiro) and E (locale-specific alternative representation usually used for long dates, e.g. janeiro), so we simply needed to tell Django to use a different DATE_FORMAT for Portuguese. This required creating a Python package structure under our project so that locale/pt/formats.py could set DATE_FORMAT = 'j \d\e E \d\e Y'; Django locates such custom format modules through the FORMAT_MODULE_PATH setting.
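
A minimal sketch of that layout, assuming the project package is called myproject (a made-up name) and that the format modules live under its locale package as described above. The two files are shown in one block for brevity. A raw string is used so that Python does not treat \d and \e as string escapes; the backslashes are for Django, telling it to output the letters d and e literally instead of reading them as format codes.

```python
# --- myproject/settings.py (project name is an assumption) ---
USE_L10N = True
# Dotted path of the package that holds per-language format modules.
FORMAT_MODULE_PATH = "myproject.locale"

# --- myproject/locale/pt/formats.py ---
# myproject/locale/ and myproject/locale/pt/ each need an __init__.py
# so that Django can import this module.
DATE_FORMAT = r"j \d\e E \d\e Y"  # intended result: 20 de janeiro de 2012
```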

Dealing with translation services

By now, your project has a bunch of gettext .po files containing English text which need to be translated. There are a number of commercial translation firms and some community-oriented ones such as Transifex and Google Translator Toolkit. The right choice will depend on your project, team and budget.

No matter what you pick, you're effectively sending part of your source code to non-programmers. You'll need to decide how to handle reviews and version control, and how to prevent hand-editing mistakes or buggy software from destabilizing your translations. Since gettext returns the original-language string whenever no translation is available, missing translations fail silently, so you need to review the other supported languages regularly.
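
Because a missing translation fails silently, an automated check helps. The following is a deliberately simplified sketch, not a real .po parser: it assumes single-line msgid and msgstr entries separated by blank lines, ignores plural and fuzzy entries, and the function name and sample catalog are invented for illustration.

```python
import re


def untranslated_msgids(po_text):
    """Return msgids whose msgstr is empty (single-line entries only)."""
    missing = []
    # Entries in a .po file are separated by blank lines.
    for entry in re.split(r"\n\s*\n", po_text.strip()):
        msgid = re.search(r'^msgid "(.*)"$', entry, re.MULTILINE)
        msgstr = re.search(r'^msgstr "(.*)"$', entry, re.MULTILINE)
        # The header entry has an empty msgid and is skipped.
        if msgid and msgstr and msgid.group(1) and not msgstr.group(1):
            missing.append(msgid.group(1))
    return missing


sample = '''\
msgid ""
msgstr ""
"Language: pt\\n"

#: templates/home.html:3
msgid "Welcome to our site"
msgstr "Bem-vindo ao nosso site"

#: templates/home.html:9
msgid "Last updated"
msgstr ""
'''

print(untranslated_msgids(sample))
```

For real catalogs, gettext's own msgattrib --untranslated handles multi-line and plural entries properly; the sketch only shows the kind of check worth running after each round trip with translators.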

Translation file utilities

Gettext provides a number of handy utilities for working with .po files without editing them by hand. In particular, msgcat (concatenates and consistently formats files), msgfmt (compiles files and, with --check, validates them) and msggrep (structure-aware search) all have cumbersome UIs but will save you hassle in the long run.

By default, .po files include a comment with the source locations for each msgid. This is helpful context in some cases, but it means your .po files change every time you run makemessages after an unrelated source edit that shifts the line number of a translated string. Git's textconv feature allows me to run an external program when diff-ing a .po file, and the msgcat utility allows me to sort the file and strip the location comments so I can focus on changes to the actual translation strings.

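~/.gitattributes (Git only reads this file if core.attributesfile points to it; a .gitattributes file inside the repository also works):

                *.po diff=msgcat

~/.gitconfig:

                [diff "msgcat"]
                    textconv = msgcat --no-location --sort-output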
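
To see why stripping locations quiets the diff, here is a rough Python approximation of the --no-location part only. It is not a replacement for msgcat (it neither sorts nor reformats entries), and the function name and sample entries are invented for this sketch.

```python
def strip_locations(po_text):
    """Drop '#:' source-location comments, roughly like msgcat --no-location."""
    kept = [line for line in po_text.splitlines() if not line.startswith("#:")]
    return "\n".join(kept)


# The same entry before and after an unrelated edit moved the string four lines down.
before = '#: views.py:10\nmsgid "Hello"\nmsgstr "Olá"\n'
after = '#: views.py:14\nmsgid "Hello"\nmsgstr "Olá"\n'

print(before == after)
print(strip_locations(before) == strip_locations(after))
```

The raw files differ, but the stripped versions are identical, which is what the textconv filter shows to git diff.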