I18N is an important piece of any modern program. Unfortunately, setting up i18n in your program is often a confusing process. The functions provided here aim to make the programming side of that a little easier.
Most projects will be able to do something like this when they start up:
# myprogram/__init__.py:
import os
import sys
from kitchen.i18n import easy_gettext_setup
_, N_ = easy_gettext_setup('myprogram', localedirs=(
    os.path.join(os.path.realpath(os.path.dirname(__file__)), 'locale'),
    os.path.join(sys.prefix, 'lib', 'locale')
))
Then, in other files that have strings that need translating:
# myprogram/commands.py:
from myprogram import _, N_
def print_usage():
    print(_(u"""available commands are:
    --help              Display help
    --version           Display version of this program
    --bake-me-a-cake    as fast as you can
    """))

def print_invitations(age):
    print(_('Please come to my party.'))
    print(N_('I will be turning %(age)s year old',
             'I will be turning %(age)s years old', age) % {'age': age})
See the documentation of easy_gettext_setup() and get_translation_object() for more details.
See also
- gettext: for details of how the python gettext facilities work
- babel: the babel module, for in-depth information on gettext, message catalogs, and translating your app. babel provides some nice features for i18n on top of gettext
easy_gettext_setup() should satisfy the needs of most users. get_translation_object() is designed to ease the way for anyone that needs more control.
Set up translation functions for an application
Returns: tuple of the gettext function and the gettext function for plurals
Setting up gettext can be a little tricky because of a lack of documentation. This function will set up gettext using the Class-based API for you. For the simple case, you can use the default arguments and call it like this:
_, N_ = easy_gettext_setup()
This gives you two functions, _() and N_(), that you can use to mark strings in your code for translation. _() marks strings that keep the same form no matter what the value of a variable in them is. N_() marks strings that need a different form when a variable in the string is plural.
Note
The gettext functions returned from this function should be superior to the ones returned from gettext. The traits that make them better are described in the DummyTranslations and NewGNUTranslations documentation.
Changed in version kitchen-0.2.4 (API kitchen.i18n 2.0.0): Changed easy_gettext_setup() to return the lgettext functions instead of the gettext functions when use_unicode=False.
Get a translation object bound to the message catalogs
Returns: Translation object to get gettext methods from
If you need more flexibility than easy_gettext_setup() offers, use this function. It sets up a gettext Translation object and returns it to you. You can then call any of the object's methods directly. For instance, if you specifically need to access lgettext():
from kitchen.i18n import get_translation_object

translations = get_translation_object('foo')
translations.lgettext('My Message')
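For comparison, here is a minimal sketch of the same pattern using the standard library's gettext.translation(), which this function is modelled on. It is a Python 3 stand-in, not kitchen's implementation; the domain 'foo' comes from the example above, while the directory path and language are placeholders chosen for illustration.

```python
import gettext

# fallback=True returns a NullTranslations object instead of raising
# an error when no catalog for the domain can be found.
translations = gettext.translation(
    'foo',
    localedir='/nonexistent/locale',  # placeholder path with no catalogs
    languages=['es'],
    fallback=True,
)

# Methods are called directly on the returned object.
print(translations.gettext('My Message'))
```

Because no catalog exists at the placeholder path, the message comes back untranslated.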
This function is similar to the python standard library gettext.translation() but improves on it in two ways:
- It returns NewGNUTranslations or DummyTranslations objects by default. These are superior to the gettext.GNUTranslations and gettext.NullTranslations objects because they are consistent in the string type they return and they fix several issues that can cause the python standard library objects to throw UnicodeError.
- It accepts multiple directories to search for message catalogs.
The latter is important when setting up gettext in a portable manner. There is not a common directory for translations across operating systems so one needs to look in multiple directories for the translations. get_translation_object() is able to handle that if you give it a list of directories to search for catalogs:
translations = get_translation_object('foo', localedirs=(
    os.path.join(os.path.realpath(os.path.dirname(__file__)), 'locale'),
    os.path.join(sys.prefix, 'lib', 'locale')))
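With these arguments, the function searches a locale directory in the same directory as the module making the call, followed by the lib/locale directory under sys.prefix.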
This allows gettext to work on Windows and in development (where the message catalogs are typically in the toplevel module directory) and also when installed under Linux (where the message catalogs are installed in /usr/share/locale). You (or the system packager) just need to install the message catalogs in /usr/share/locale and remove the locale directory from the module to make this work. For example:
In development:
~/foo # Toplevel module directory
~/foo/__init__.py
~/foo/locale # With message catalogs below here:
~/foo/locale/es/LC_MESSAGES/foo.mo
Installed on Linux:
/usr/lib/python2.7/site-packages/foo
/usr/lib/python2.7/site-packages/foo/__init__.py
/usr/share/locale/ # With message catalogs below here:
/usr/share/locale/es/LC_MESSAGES/foo.mo
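The idea of looking through several directories for catalog files can be sketched with the standard library's gettext.find(). This is only an illustration under stated assumptions: find_catalogs() is a hypothetical helper, not part of kitchen, and the current working directory stands in for the module directory. It is written for Python 3.

```python
import gettext
import os
import sys


def find_catalogs(domain, localedirs, languages=None):
    """Hypothetical helper: list every matching .mo file, in search order."""
    found = []
    for localedir in localedirs:
        # all=True makes gettext.find() return a list of every catalog
        # that exists under this directory for the requested languages.
        found.extend(gettext.find(domain, localedir, languages, all=True))
    return found


search_path = (
    os.path.join(os.path.realpath(os.getcwd()), 'locale'),
    os.path.join(sys.prefix, 'lib', 'locale'),
)
print(find_catalogs('foo', search_path, languages=['es']))
```

A directory that exists but holds no catalog for the domain contributes nothing, so the search simply continues with the next entry.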
Note
This function will set up Translation objects that attempt to look up msgids in all of the found message catalogs. This means that if you have several versions of the message catalogs installed in different directories that the function searches, you need to make sure that localedirs lists the directories so that newer message catalogs are searched first. It also means that if a newer catalog does not contain a translation for a msgid but an older one that’s in localedirs does, the translation from that older catalog will be returned.
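The lookup order described in this note can be sketched with the fallback mechanism of the standard library's translation objects. DictTranslations below is a hypothetical in-memory stand-in for a catalog loaded from a .mo file, and the messages are invented for the example; it is a Python 3 sketch, not kitchen's code.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog standing in for a .mo file."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def gettext(self, message):
        if message in self._catalog:
            return self._catalog[message]
        # NullTranslations.gettext() asks the fallback, if there is one,
        # and otherwise returns the message unchanged.
        return super().gettext(message)


newer = DictTranslations({'Hello': 'Hola'})
older = DictTranslations({'Hello': 'Buenas', 'Goodbye': 'Hasta luego'})
newer.add_fallback(older)  # searched only when `newer` has no entry

print(newer.gettext('Hello'))
print(newer.gettext('Goodbye'))
print(newer.gettext('Unknown'))
```

The first lookup is answered by the newer catalog, the second falls through to the older one, and the third is returned untranslated.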
Changed in version kitchen-1.1.0 (API kitchen.i18n 2.1.0): Add more parameters to get_translation_object() so it can more easily be used as a replacement for gettext.translation(). Also change the way we use localedirs. We cycle through them until we find a suitable locale file rather than simply cycling through until we find a directory that exists. The new code is based heavily on the python standard library gettext.translation() function.
The standard translation objects from the gettext module suffer from several problems, among them inconsistency in the string type they return and methods that can throw UnicodeError.
DummyTranslations and NewGNUTranslations were written to fix these issues.
Safer version of gettext.NullTranslations
This Translations class doesn’t translate the strings and is intended to be used as a fallback when there were errors setting up a real Translations object. It’s safer than gettext.NullTranslations in its handling of byte str vs unicode strings.
Unlike NullTranslations, this Translation class will never throw a UnicodeError. The code that you have around a call to DummyTranslations might throw a UnicodeError but at least that will be in code you control and can fix. Also, unlike NullTranslations all of this Translation object’s methods guarantee to return byte str except for ugettext() and ungettext() which guarantee to return unicode strings.
When byte str are returned, the strings will be encoded according to this algorithm:
For ugettext() and ungettext(), we go through the same set of steps with the following differences:
input_charset is an extension to the python standard library gettext that specifies what charset a message is encoded in when decoding a message to unicode. This is used for two purposes:
Any characters that can’t be transformed from a byte str to a unicode string or vice versa will be replaced with a replacement character (for example, u'�' in unicode-based encodings, '?' in other ASCII-compatible encodings).
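The replacement behaviour can be seen with Python's built-in 'replace' error handler. This small sketch only demonstrates the general mechanism in Python 3; it makes no claim about how kitchen performs the conversion internally.

```python
# Decoding bytes that are not valid UTF-8: the bad byte becomes U+FFFD.
decoded = b'caf\xe9'.decode('utf-8', 'replace')
print(repr(decoded))

# Encoding text to a charset that lacks the character: it becomes '?'.
encoded = u'caf\xe9'.encode('ascii', 'replace')
print(repr(encoded))
```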
Changed in version kitchen-1.1.0 (API kitchen.i18n 2.1.0):
- Although we had adapted gettext(), ngettext(), lgettext(), and lngettext() to always return byte str, we hadn’t forced those byte str to always be in a specified charset. We now make sure that gettext() and ngettext() return byte str encoded using output_charset if set, otherwise charset, and if neither of those is set, UTF-8. lgettext() and lngettext() use output_charset if set, otherwise locale.getpreferredencoding().
- Setting input_charset and output_charset now also sets those attributes on any fallback translation objects.
Set the output charset
This method serves two purposes. First, the normal gettext.NullTranslations.set_output_charset() does not set the output charset on fallback objects. Second, on python-2.3, gettext.NullTranslations objects don’t contain this method.
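To make the first purpose concrete, here is a minimal sketch of a setter that passes the charset down a chain of fallback objects. ChainedTranslations and its attribute names are hypothetical and exist only for this illustration; the real classes carry considerably more state.

```python
class ChainedTranslations(object):
    """Hypothetical minimal translation object that tracks a fallback."""

    def __init__(self):
        self.output_charset = None
        self.fallback = None

    def add_fallback(self, fallback):
        # Append to the end of the chain.
        if self.fallback is not None:
            self.fallback.add_fallback(fallback)
        else:
            self.fallback = fallback

    def set_output_charset(self, charset):
        # Set the charset here and on every object further down the chain.
        self.output_charset = charset
        if self.fallback is not None:
            self.fallback.set_output_charset(charset)


primary = ChainedTranslations()
secondary = ChainedTranslations()
primary.add_fallback(secondary)
primary.set_output_charset('latin-1')
print(secondary.output_charset)
```

Without the propagation step, the fallback would keep encoding its output with whatever charset it had before.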
Safer version of gettext.GNUTranslations
gettext.GNUTranslations suffers from two problems that this class fixes.
When byte str are returned, the strings will be encoded according to this algorithm:
For ugettext() and ungettext(), we go through the same set of steps with the following differences:
input_charset is an extension to the python standard library gettext that specifies what charset a message is encoded in when decoding a message to unicode. This is used for two purposes:
Any characters that can’t be transformed from a byte str to a unicode string or vice versa will be replaced with a replacement character (for example, u'�' in unicode-based encodings, '?' in other ASCII-compatible encodings).
Changed in version kitchen-1.1.0 (API kitchen.i18n 2.1.0): Although we had adapted gettext(), ngettext(), lgettext(), and lngettext() to always return byte str, we hadn’t forced those byte str to always be in a specified charset. We now make sure that gettext() and ngettext() return byte str encoded using output_charset if set, otherwise charset, and if neither of those is set, UTF-8. lgettext() and lngettext() use output_charset if set, otherwise locale.getpreferredencoding().
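The charset selection order in this change note can be written out as a small function. This is a sketch of the described order only: pick_output_encoding() and its parameter names are hypothetical and are not part of the kitchen API.

```python
import locale


def pick_output_encoding(output_charset=None, charset=None,
                         locale_variant=False):
    """Hypothetical helper following the order in the change note.

    locale_variant=True models lgettext()/lngettext(); False models
    gettext()/ngettext().
    """
    if output_charset:
        return output_charset
    if locale_variant:
        return locale.getpreferredencoding()
    return charset or 'utf-8'


print(pick_output_encoding(output_charset='latin-1', charset='koi8-r'))
print(pick_output_encoding(charset='koi8-r'))
print(pick_output_encoding())
```

In both families of methods an explicitly set output_charset wins; they differ only in what they fall back to when it is unset.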