Internationalization (I18N)

When I was asked to set up a small web site for a relative, I started looking for a small CMS that stores its data in the filesystem rather than in a database.

Soon I discovered GetSimple. It did not support multi-language sites, but it was very extensible, so I created a plugin that lets your web site use and switch between multiple languages and that, incidentally, provides hierarchical, multi-language-aware navigation.

To experience this plugin here on this website, you can change your browser's language settings (this site supports German and English) and reload the page, follow this link to see this page in German, or permanently switch the language with the links on the top right of the page.

Here you can find more screenshots of the administration.

Installation

Download the plugin from http://get-simple.info/extend/plugin/i18n/69/, unzip it and copy it to the plugins directory of your GetSimple installation.

  • In your template(s), replace get_navigation(return_page_slug()) with get_i18n_navigation(return_page_slug()), get_component(id) with get_i18n_component(id), and get_header() with get_i18n_header().
  • You can change the default language on the I18N view.
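As a rough sketch, a theme template might look like this after the replacements. The surrounding markup and the component name sidebar are placeholders chosen for this example; only the three function names come from the step above.

```php
<!DOCTYPE html>
<html>
<head>
  <?php get_i18n_header(); /* was: get_header() */ ?>
</head>
<body>
  <ul>
    <?php get_i18n_navigation(return_page_slug()); /* was: get_navigation(return_page_slug()) */ ?>
  </ul>
  <!-- ... rest of your template stays unchanged ... -->
  <div id="sidebar">
    <?php get_i18n_component('sidebar'); /* was: get_component('sidebar'); 'sidebar' is a placeholder name */ ?>
  </div>
</body>
</html>
```

This fragment is meant to live inside a GetSimple theme with the plugin installed; it does not run on its own.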

Usage

Go to the Pages tab and enter the two-letter language code of your default language and press Save.

Create another language version of a page by creating a new page and (after opening the page options) setting its URL/Slug to the slug of the default language version + "_" + language code, e.g. for German pages:

  • index -> index_de
  • my-page -> my-page_de
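The naming rule can be written down as two small helpers. Both functions are hypothetical and not part of the plugin; they only illustrate the convention, assuming two-letter language codes and default-language slugs without an underscore.

```php
<?php
// Hypothetical helper (not part of the plugin): build the slug of a
// translated page from the default-language slug and a language code.
function i18n_example_slug($defaultSlug, $lang)
{
    return $defaultSlug . '_' . strtolower($lang);
}

// Hypothetical helper: split a slug into (default slug, language code).
// Returns null as the language for a default-language slug.
function i18n_example_split_slug($slug)
{
    if (preg_match('/^(.+)_([a-z]{2})$/', $slug, $m)) {
        return array($m[1], $m[2]);
    }
    return array($slug, null);
}

echo i18n_example_slug('index', 'de'), "\n";   // index_de
echo i18n_example_slug('my-page', 'de'), "\n"; // my-page_de
```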

Once you have created one page in a language, you can create further pages in that language by clicking the [+] symbol on the right-hand side of an empty cell in the table on the Page Management page.

To display a page in a language other than the user's preferred language, you can use a link like http://my.site/my/page?lang=de (German). The helper function return_i18n_lang_url creates a link to the current page in another language:

<a href="<?php echo htmlspecialchars(return_i18n_lang_url('de')); ?>">In deutsch</a>

You can also use the helper function find_i18n_url to create a link to another page in a specific language:

<a href="<?php echo htmlspecialchars(find_i18n_url('my-page','my-page-parent','de')); ?>">Meine Seite in deutsch</a>

To switch the language for the current session, add links like http://my.site/?setlang=de (German) to your template or home page. You can use the helper function return_i18n_setlang_url to switch the language, but stay on the same page:

<a href="<?php echo htmlspecialchars(return_i18n_setlang_url('de')); ?>">deutsch</a>
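If the site has several languages, the links can be generated in a loop instead of being written by hand. The following is a minimal sketch for a theme template: it uses return_i18n_available_languages() and return_i18n_setlang_url() from the API list on this page, while the $labels map is an assumption you would maintain yourself.

```php
<?php
// Sketch of a language switcher for a theme template.
// $labels is an assumed, hand-maintained map of language code => link text.
$labels = array('en' => 'English', 'de' => 'Deutsch');

// The guard keeps the template from failing if the plugin is disabled.
if (function_exists('return_i18n_available_languages')) {
    foreach (return_i18n_available_languages() as $lang) {
        // Fall back to the bare language code if no label is defined.
        $label = isset($labels[$lang]) ? $labels[$lang] : $lang;
        echo '<a href="' . htmlspecialchars(return_i18n_setlang_url($lang)) . '">'
           . htmlspecialchars($label) . '</a> ';
    }
}
```

Outside of GetSimple the guard is false and the snippet prints nothing.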

Canonical URLs

By default, I18N uses the same language-independent URL for all languages and chooses the displayed language based on the user's language settings.

To make pages distinguishable for Google & Co, get_i18n_header() automatically generates a canonical URL with the language in the header, e.g.:

<link rel="canonical" href="http://mysite.com/mypage/?lang=en" />

If you are concerned that the query parameter will hurt your ranking in search engines, you can include the language directly in the URL, separated by a special character, e.g. ':' or '$'. Just add the following to your gsconfig.php:

define('I18N_SEPARATOR',':');

And use rules like the following in your .htaccess:

...
RewriteRule /?([A-Za-z0-9-]+):([a-z][a-z])/?$ index.php?id=$1&lang=$2 [QSA,L]
RewriteRule /?([A-Za-z0-9-]+)/?$ index.php?id=$1 [QSA,L]

This way the canonical URL will be generated as:

<link rel="canonical" href="http://mysite.com/mypage:en/" />

Or you can always include the language in the URL as described below.

Set a default language other than the user's preference

If you want to ignore the user's language and use another default language, e.g. with multiple domains for a single site, you can use a .htaccess like the following:

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.+ - [L]
RewriteCond %{HTTP_HOST} ^my\.domain\.de [NC]
RewriteCond %{QUERY_STRING} !(^|&)(set)?lang=
RewriteCond %{HTTP_COOKIE} !(^|;)\s*language=
RewriteRule ^(.*)? $1?lang=de [QSA,DPI]
RewriteCond %{HTTP_HOST} ^my\.domain\.fr [NC]
RewriteCond %{QUERY_STRING} !(^|&)(set)?lang=
RewriteCond %{HTTP_COOKIE} !(^|;)\s*language=
RewriteRule ^(.*)? $1?lang=fr [QSA,DPI]
RewriteRule /?([A-Za-z0-9_-]+)/?$ index.php?id=$1 [QSA,L]
RewriteRule ^ index.php [L]

This sets German as the default language on my.domain.de, French on my.domain.fr, and the user's preferred language on other domains (e.g. my.domain.com).

Include the Language in the URL

Normally the language is not displayed in the URL but automatically determined by the user's preferences.

If you nevertheless want to include the language in the URL, you need to specify a permalink structure in the website settings of the administration and include the placeholder %language%, e.g.:

%language%/%parent%/%slug%/

Additionally, you need to modify the rules in your root .htaccess file, e.g. (for the permalink structure above, with English and German):

...
# optional: redirect the site root to the English version
RewriteRule ^/?$ en/ [R,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(en|de)/(.*?/)?([A-Za-z0-9-]+)/?$ index.php?id=$3&lang=$1 [QSA,L]
RewriteRule ^(en|de)/?$ index.php?lang=$1 [QSA,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule /?([A-Za-z0-9-]+)/?$ index.php?id=$1 [QSA,L]

The first rule is optional: if you omit it, the best-matching language according to the user's browser settings is chosen.

Be aware that if a page does not exist in the requested language, the next best language is used and the language in the URL and the actual language will differ.
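The fallback can be pictured with a small sketch. The function name and the exact rule (first requested language for which the page exists, else the default language) are assumptions made for illustration; the plugin's own selection logic may differ.

```php
<?php
// Hypothetical illustration (not the plugin's code): pick the language in
// which a page is shown.
//   $requested: languages in order of preference, best first
//   $available: languages in which the page exists
//   $default:   the site's default language, used as the last resort
function i18n_example_pick_language(array $requested, array $available, $default)
{
    foreach ($requested as $lang) {
        if (in_array($lang, $available, true)) {
            return $lang;
        }
    }
    return $default;
}

// The URL asks for German, but the page only exists in English and French:
echo i18n_example_pick_language(array('de', 'fr', 'en'), array('en', 'fr'), 'en'), "\n"; // fr
```

In this example the URL would still say de while the visitor sees the French version, which is the mismatch described above.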

You can also use the placeholder %nondefaultlanguage% to include the language only for languages other than the default language, e.g.:

%nondefaultlanguage%/%parent%/%slug%/

Your .htaccess file would look like this (not tested; English as the default language plus German and French):

...
# optional
RewriteRule ^/?$ en/ [R,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(fr|de)/(.*?/)?([A-Za-z0-9-]+)/?$ index.php?id=$3&lang=$1 [QSA,L]
RewriteRule ^(fr|de)/?$ index.php?lang=$1 [QSA,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*?/)?([A-Za-z0-9-]+)/?$ index.php?id=$2&lang=en [QSA,L]

Language dependent Content

If you want to show language-dependent content in the template (e.g. a sidebar) in addition to the page content itself and the navigation, you have the following options:

Components:

Create components (Theme/Edit Components) named mycomponentname (default language) and mycomponentname_xx (for languages xx) with the content in the respective language (e.g. components sidebar, sidebar_de, sidebar_es) and add the following to your template:

<?php get_i18n_component('mycomponentname'); ?>

Pages:

If you want a non-technical user to edit the content in the WYSIWYG editor, instead create multiple pages mypagename (default language) and mypagename_xx (for languages xx) with the respective contents and make sure to uncheck Add to Menu. Then include the page in the template with the following code:

<?php get_i18n_content('mypagename'); ?>

Conditions:

If you have only small texts that need not be changed by non-technical users, you can also directly add conditions to your template:

<?php if ($language == 'en') { ?>English text<?php } ?>
<?php if ($language == 'de') { ?>Deutscher Text<?php } ?>
<?php if ($language == 'es') { ?>...<?php } ?>
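With more than two or three languages, the chain of conditions can be replaced by an array lookup. This is only a sketch: the helper name and the fallback to 'en' are assumptions for this example, and $language is the same variable used in the conditions above.

```php
<?php
// Hypothetical helper (not part of the plugin): return the text for
// $language, or the text for $fallback if there is no translation.
function i18n_example_text(array $texts, $language, $fallback = 'en')
{
    $lang = (is_string($language) && isset($texts[$language])) ? $language : $fallback;
    return isset($texts[$lang]) ? $texts[$lang] : '';
}

$texts = array(
    'en' => 'English text',
    'de' => 'Deutscher Text',
);

// isset() avoids a notice if $language is not defined in this scope.
echo htmlspecialchars(i18n_example_text($texts, isset($language) ? $language : null));
```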

Single Language

If you like the plugin (maybe because of the improved pages view in the administration) but you really have only one language (remember: no _ in your page slugs), you can hide the multi-language comments in the pages view by adding the following to your gsconfig.php:

define('I18N_SINGLE_LANGUAGE',true);

Hierarchical URLs

Since version 3.2 you can use the placeholder %parents% in your fancy URL in order to include all ancestors of your page and not just the immediate parent.

API

Public functions:

  • return_i18n_default_language() returns the default language.
  • return_i18n_languages() returns an array of user requested languages with the best first, e.g. ( 'de', 'fr', 'en' )
  • return_i18n_available_languages($slug=null) returns the available languages of the site or - if $slug is given - the available languages for that page (version 2.5.3+).
  • return_i18n_page_data($slug) returns the XML data for the best fitting language version of the $slug
  • find_i18n_url($slug,$parent,$language=null,$type='full') returns the URL to the page identified by $slug/$parent in the given $language (see also core function find_url)
  • return_i18n_lang_url($language=null) returns the URL to the current page in the given $language (if null, the default language is used)
  • return_i18n_setlang_url($language) returns the URL to the current page which also sets the preferred $language. If the current URL does not have a parameter lang then this causes the page to be displayed in the given $language (if it exists).
  • return_i18n_component($slug) returns the component content (unprocessed).
  • i18n_init() - call this function from another (I18N aware) plugin in the hook index-pretemplate, if the plugin uses any language-dependent page data.
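To illustrate what a "best first" list such as the one returned by return_i18n_languages() looks like, here is a sketch that derives it from a browser's Accept-Language header. It is not the plugin's implementation, which may work differently (and also takes the lang and setlang parameters into account); the function name is hypothetical, and only two-letter primary language codes are kept.

```php
<?php
// Hypothetical illustration: turn an Accept-Language header into a list of
// two-letter language codes, best first.
function i18n_example_parse_accept_language($header)
{
    $weighted = array();
    foreach (explode(',', $header) as $position => $part) {
        $pieces = explode(';', trim($part));
        // Keep only the primary subtag, e.g. "de" from "de-DE".
        $subtags = explode('-', strtolower(trim($pieces[0])));
        $lang = $subtags[0];
        if (!preg_match('/^[a-z]{2}$/', $lang)) {
            continue; // skips "*", empty parts and longer codes
        }
        // A missing q parameter means a weight of 1.
        $q = 1.0;
        if (isset($pieces[1]) && preg_match('/^\s*q=([0-9.]+)\s*$/i', $pieces[1], $m)) {
            $q = (float) $m[1];
        }
        // The first mention of a language wins.
        if (!isset($weighted[$lang])) {
            $weighted[$lang] = array($q, $position);
        }
    }
    // Sort by weight (highest first); equal weights keep the header order.
    uasort($weighted, function ($a, $b) {
        if ($a[0] != $b[0]) {
            return $a[0] < $b[0] ? 1 : -1;
        }
        return $a[1] - $b[1];
    });
    return array_keys($weighted);
}

print_r(i18n_example_parse_accept_language('de-DE,de;q=0.9,fr;q=0.8,en;q=0.7'));
```

For the header in the example this yields the languages in the order de, fr, en.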

Display functions:

  • get_i18n_header() is like get_header, but tags beginning with _ are ignored and the canonical URL contains the language
  • get_i18n_content($slug) outputs the content of page $slug best fitting the user's language(s)
  • get_i18n_component($id, $param1, ...) outputs the (localized) component
  • get_i18n_link($slug) outputs a link to the given page in the best language