Set up i18n with gettext in your PHP application

Published on 28 September 2009 by Theclimber

What is gettext and why use it?

gettext is the GNU internationalization and localization (i18n) library. It is commonly used for writing multilingual programs. Implementations exist for many programming languages, and it is widely used in PHP applications.

But what do we mean by internationalisation? When you write code, you also write sentences that will be shown to the user who is running the application. Those sentences are written in a language of your choice. But what if that person doesn't understand that language?

The first reaction would be to say: "OK, I'm going to make another version of the code in another language. I'll translate all those sentences so that my application can be used by other people." And we agree, this is indeed the first solution that comes to mind. But it is not optimal: whenever you modify your initial app, you have to modify all the translated copies too, and that is not a viable way forward. Working like this is totally broken because it implies an enormous quantity of duplicated code and a lot of work!

That's where gettext comes in and solves all these problems! The gettext solution is to replace all those strings with a call to a gettext function that takes your sentence as a parameter. This function checks the chosen language: if it knows a translation of the sentence in that language, it returns the translated sentence; otherwise it returns the original sentence.
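
To make the idea concrete, here is a minimal sketch of that lookup-with-fallback behaviour in plain PHP. It is not how gettext is implemented internally (real gettext reads compiled .mo catalogs); the translate() function and the $catalog array are hypothetical names used only for illustration.

```php
<?php
// Hypothetical in-memory catalog: locale => (original sentence => translation).
$catalog = [
    'fr' => [
        'Have a nice day' => 'Bonne journée',
    ],
];

// Return the translation for $locale if one is known,
// otherwise fall back to the original sentence.
function translate(array $catalog, string $locale, string $sentence): string
{
    return $catalog[$locale][$sentence] ?? $sentence;
}

echo translate($catalog, 'fr', 'Have a nice day'), "\n"; // Bonne journée
echo translate($catalog, 'fr', 'Welcome'), "\n";         // Welcome (no translation known)
echo translate($catalog, 'de', 'Have a nice day'), "\n"; // Have a nice day (unknown locale)
```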

Set up gettext

Install the gettext library

To use the gettext functions shown in this tutorial, you need to install the php-gettext library and import it into your PHP application. If the php-gettext package is installed on your system, you can easily find it in the directory /usr/share/php/php-gettext. Copy that directory into your app (the code below expects it in lib/gettext).

Once done, you need to import the php-gettext library in all the files of your application. So let's make a generic file called i18n.php which will contain all the i18n parameters:

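<?php
// Load the php-gettext wrapper functions (T_, T_ngettext, ...) and the app config.
require_once(dirname(__FILE__).'/lib/gettext/gettext.inc');
require_once(dirname(__FILE__).'/config.php');

// Default language: the BP_LANG constant (presumably defined in config.php), falling back to French.
$locale = BP_LANG;
$textdomain = "my_project";
if (empty($locale))
	$locale = 'fr';
// Testing aid: ?locale=xx in the URL overrides the configured language.
if (isset($_GET['locale']) && !empty($_GET['locale']))
	$locale = $_GET['locale'];
putenv('LANGUAGE='.$locale);
putenv('LANG='.$locale);
putenv('LC_ALL='.$locale);
putenv('LC_MESSAGES='.$locale);
T_setlocale(LC_ALL, $locale);
T_setlocale(LC_CTYPE, $locale);

// Directory that holds <locale>/LC_MESSAGES/<domain>.mo; adapt the path to your layout.
$locales_dir = dirname(__FILE__).'/../i18n';
T_bindtextdomain($textdomain, $locales_dir);
T_bind_textdomain_codeset($textdomain, 'UTF-8');
T_textdomain($textdomain);
?>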

If you look closely, you'll see we put a mechanism in our i18n.php file to test the translations easily. If you don't want to change the configured locale each time you want to test another language, you can just add a parameter to the URL in your browser to set the language of your choice: /index.php?locale=en will give you English and /index.php?locale=fr will give you French. This makes testing easy.
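
Because the locale value comes straight from the URL and ends up in environment variables and catalog lookups, it may be worth accepting only the languages you actually ship. A minimal sketch, assuming a hypothetical pick_locale() helper and a hand-maintained list of supported locales (neither is part of php-gettext):

```php
<?php
// Hypothetical helper: choose the locale from the query string, but only
// accept values that appear in the list of supported locales.
function pick_locale(array $query, array $supported, string $default): string
{
    $requested = $query['locale'] ?? '';
    if (is_string($requested) && in_array($requested, $supported, true)) {
        return $requested;
    }
    return $default;
}

$supported = ['en', 'fr'];

echo pick_locale(['locale' => 'en'], $supported, 'fr'), "\n";        // en
echo pick_locale(['locale' => '../../etc'], $supported, 'fr'), "\n"; // fr
echo pick_locale([], $supported, 'fr'), "\n";                        // fr
```

In i18n.php, a call such as pick_locale($_GET, $supported, $locale) could then take the place of the isset() test.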

Convert all your strings into gettext strings

First you need to change your code to use gettext for all your translatable strings. You will encounter two situations: the string is either inside <?php ?> tags or outside of them. So let's see how to handle both:

<?php
echo '<h1>'.T_('title').'</h1>';
?>
<p><?=T_("Welcome to My PHP Application");?></p>
<p><?=T_gettext("Have a nice day");?></p>

And if you want to use PHP variables in your text, you can do it with sprintf:

<?php
echo '<h1>'.sprintf(T_('The story of %s'), $author).'</h1>';
?>
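
One reason to prefer sprintf over string concatenation is that a translator may need to put the variables in a different order. PHP's sprintf accepts numbered placeholders such as %1$s for this. The sketch below uses plain strings in place of T_() calls so that it runs without php-gettext; the French sentence stands in for what a translated catalog might return, and the $year variable is made up for the example.

```php
<?php
$author = 'Theclimber';
$year   = 2009;

// What the source code would contain (English word order).
$english = 'Written by %1$s in %2$d';
// What a translation might look like, with the arguments swapped.
$french  = 'En %2$d, écrit par %1$s';

// The same call works for both, because the placeholders are numbered.
echo sprintf($english, $author, $year), "\n"; // Written by Theclimber in 2009
echo sprintf($french, $author, $year), "\n";  // En 2009, écrit par Theclimber
```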

Of course, you'll also run into trickier situations, such as plural forms. Plurals do not work the same way in English as in other languages, so they also have to be handled during translation. Let's see, for example, how to handle a sentence whose plurality depends on a variable:

<?php
$n_windows = 5;
# The solution with a simple string:
printf(T_ngettext("%d window", "%d windows", $n_windows), $n_windows);
# Or the solution with composed strings:
echo sprintf(T_ngettext("There is %d window", "There are %d windows", $n_windows), $n_windows)." in that room";
?>

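Here the third argument of T_ngettext ($n_windows) decides which form is used, and %d is simply the placeholder that sprintf or printf fills with the number afterwards. In English, a value of 1 gives the singular and any other value gives the plural. Other languages follow their own rules, which are declared in the Plural-Forms header of each .po file.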
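
The rule that maps a number to a plural form differs between languages, which is why the choice is left to the catalog rather than hard-coded. The sketch below imitates that selection for two languages; plural_index() and pick_form() are hypothetical helpers, and the two rules are the usual Plural-Forms expressions for English (n != 1) and French (n > 1).

```php
<?php
// Hypothetical helper: index of the plural form to use (0 = singular, 1 = plural).
function plural_index(string $lang, int $n): int
{
    switch ($lang) {
        case 'fr':
            return $n > 1 ? 1 : 0;   // French: 0 and 1 are singular
        default:
            return $n != 1 ? 1 : 0;  // English: only 1 is singular
    }
}

// Hypothetical helper: pick the right form, then fill in the number.
function pick_form(string $lang, array $forms, int $n): string
{
    return sprintf($forms[plural_index($lang, $n)], $n);
}

$en = ['%d window', '%d windows'];
$fr = ['%d fenêtre', '%d fenêtres'];

echo pick_form('en', $en, 0), "\n"; // 0 windows
echo pick_form('fr', $fr, 0), "\n"; // 0 fenêtre
echo pick_form('en', $en, 1), "\n"; // 1 window
echo pick_form('fr', $fr, 5), "\n"; // 5 fenêtres
```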

Extract all your strings for translation

First, be sure to create a directory called "i18n" in the root of your application. We will use this directory for our translations (many tutorials call this directory "locales" but I prefer "i18n"; you are of course free to make your own choice, just be sure to adapt the paths if needed).

Now that all the strings of your PHP application are converted, we need to extract them. This is where the gettext command-line tools come in, starting with xgettext:

xgettext -kT_gettext -kT_ -kT_ngettext:1,2 --from-code utf-8 -d my_project -o i18n/my_project.pot -L PHP --no-wrap -f files.txt

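This will create a file called my_project.pot in your i18n directory. The -k options tell xgettext which function names mark translatable strings, and -f files.txt gives it the list of PHP files to scan, one path per line.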
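
If you would rather not maintain files.txt by hand, a small script can generate it. A minimal sketch, assuming files.txt should simply list every .php file under a given directory; the list_php_files() and write_file_list() function names are hypothetical.

```php
<?php
// Hypothetical helper: return the paths of all .php files under $root, sorted.
function list_php_files(string $root): array
{
    $files = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === 'php') {
            $files[] = $file->getPathname();
        }
    }
    sort($files);
    return $files;
}

// Hypothetical helper: write the list to $target, one path per line.
function write_file_list(string $root, string $target): void
{
    file_put_contents($target, implode("\n", list_php_files($root)) . "\n");
}
```

Calling write_file_list('.', 'files.txt') from the root of the application would then produce the list. You may want to filter out the bundled library directory so that its own strings are not extracted.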

Create the language files

The first time you create a translation file, you have to use the msginit command, once per language:

msginit -l en -o i18n/my_project_en.po -i i18n/my_project.pot

If this is not the first time you extract your messages, you only want to merge the new strings into the existing files, without erasing your previous translations. For that, use this command:

msgmerge -U i18n/my_project_fr.po i18n/my_project.pot

The strings that were already translated stay translated. Strings that are similar to an old one are guessed by gettext and marked as fuzzy, and all the others are added untranslated. Strings that are no longer used are moved to the end of your .po file and commented out with the '#~' prefix.

Translate your app

Now it's time to work on the translation itself. Everything is ready for the internationalisation mechanism, but without translations it won't do anything. So open the .po files you created and start translating. Be careful to translate everything properly, and when a string contains placeholders such as %s or %d, keep them intact in the translation.

Compile and enable your translations

Once everything is translated, it's time to compile and enable the translations. The tree structure of your i18n files will look like this:

i18n
     /fr
          /LC_MESSAGES
              my_project.mo
     /en
          /LC_MESSAGES
              my_project.mo
     /my_project.pot
     /my_project_en.po
     /my_project_fr.po

This is the last step. Make sure the LC_MESSAGES directories from the tree above exist, then go to your shell and execute the following command for each language:

msgfmt -c -v -o i18n/fr/LC_MESSAGES/my_project.mo i18n/my_project_fr.po
4 messages translated.

That's it. Check that the .mo file was created properly. Now it should work: change the locale and the language will change. Isn't it beautiful?
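
If the translations do not show up, a common cause is a .mo file that is not where the library looks for it. The sketch below builds the expected path from the directory layout shown above and reports whether the file exists; mo_path() is a hypothetical helper, and the i18n directory and my_project domain are the ones used in this tutorial.

```php
<?php
// Hypothetical helper: where the compiled catalog for a locale is expected,
// following the <dir>/<locale>/LC_MESSAGES/<domain>.mo layout.
function mo_path(string $localesDir, string $locale, string $domain): string
{
    return rtrim($localesDir, '/') . '/' . $locale . '/LC_MESSAGES/' . $domain . '.mo';
}

// Report, for each language of the tutorial, whether its catalog is in place.
foreach (['en', 'fr'] as $locale) {
    $path = mo_path('i18n', $locale, 'my_project');
    echo $path, is_file($path) ? " found\n" : " MISSING\n";
}
```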

Strange, I didn't think gettext was like that!

For those who are used to gettext, there is normally no T_ before the strings, only _("string") or gettext("string"). The catch is that with the native functions, internationalisation only works on a server where all the locales you need (all of them) are installed, and that is most likely impossible when you are working on a server which is not yours. That's why the specific functions of php-gettext are so useful: they make your application independent of the server configuration.
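
To see the difference on your own server, you can ask PHP whether a given system locale is available: setlocale() returns false when it is not. A minimal sketch; locale_is_installed() is a hypothetical helper, and the locale name fr_FR.UTF-8 is an assumption that depends on how your system names its locales.

```php
<?php
// Hypothetical helper: true if the system knows $locale.
function locale_is_installed(string $locale): bool
{
    $previous = setlocale(LC_CTYPE, '0');          // '0' only reads the current setting
    $ok = setlocale(LC_CTYPE, $locale) !== false;  // false means "not available"
    if ($previous !== false) {
        setlocale(LC_CTYPE, $previous);            // restore what was there before
    }
    return $ok;
}

var_dump(locale_is_installed('C'));           // the C locale should always be available
var_dump(locale_is_installed('fr_FR.UTF-8')); // depends on the server
```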

I hope this tutorial makes things clearer for you. If you still have questions, don't hesitate to post them in the comments.