Going multilingual with WordPress

Going multi-lingual with Polylang

Recently a client asked us to add a second-language section embedded within the default-language site.  The default language is English and the requested language is Hindi, a North Indian language written in the Devanagari script.

This was an interesting challenge and we set about breaking new ground in our little enterprise.  Several hurdles stand in the way of a clean and elegant solution that will survive the trials of a maturing and growing site.

WordPress is well equipped to solve this problem.  Developed to cater to an international World Wide Web, WordPress comes armed with built-in translation frameworks.  However, there are several steps to look out for in order to make sure it all works properly.

I split this tutorial into the following sections:

  • Ensuring the database is configured so we can store the multi-language strings properly.
  • WordPress Admin Dashboard in other languages?
  • Is your Theme multi-lingual enabled?
  • Using a WordPress plugin to properly manage the multi-lingual content.
  • Ensuring multi-lingual scripts are properly displayed on all browsers with the correct font.
  • How to mix multi-lingual content on the same page?

So here we go.  First we have to make sure we are able to handle any special scripts and their various characters.

Ensuring the database is configured so we can store the multi-language strings properly

Every character of every script is stored as bits in a computer, and 8 bits make a byte.  For a long time the majority of computers were happy to “speak” English, whose characters fit comfortably into a single byte.  The ASCII character set uses 7 bits, which gives 128 characters, and its various 8-bit extensions used all 256 values a byte can hold.  In other words, 8 bits (1s & 0s) can be arranged in 256 different and unique ways.  Few people complained until the day the Internet reached every corner of the globe and many more characters had to find a place within an international set of characters.  This meant thousands of new characters.  To remedy this, Unicode was introduced.  Its early versions used 16 bits (2 bytes), enough to encode just over 65 thousand characters, and the standard has since grown to allow for over a million code points.  However, since so many computers were already running happily with ASCII characters, it took some brilliant inspiration to come up with the UTF-8 character encoding standard.

UTF-8 – the multi-lingual wizard

The beauty of UTF-8 is that the first 128 characters are stored as single bytes identical to their ASCII codes, making it compatible with computers and software unable to figure out anything else, while using progressively more bytes for the other Unicode characters: up to 4 bytes (32 bits) in the current standard, although the original design allowed for up to 6.  There is a great article by Joel Spolsky on this whole universe of characters which is well worth a read.
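To make the variable-length idea concrete, here is a minimal sketch in plain PHP (7.0 or later, for the \u{...} escape syntax) that prints how many bytes UTF-8 uses for a few characters.  The sample characters and the helper name utf8_bytes are my own choices for illustration.

```php
<?php
// Return the UTF-8 bytes of a string as space-separated hex pairs.
// utf8_bytes() is a hypothetical helper name used only for this example.
function utf8_bytes( string $char ): string {
    return implode( ' ', str_split( bin2hex( $char ), 2 ) );
}

$samples = array(
    'Latin A'       => "\u{0041}", // plain ASCII
    'Latin e-acute' => "\u{00E9}",
    'Devanagari ha' => "\u{0939}",
    'Emoji'         => "\u{1F600}",
);

foreach ( $samples as $label => $char ) {
    // strlen() counts bytes, not characters.
    printf( "%-14s %d byte(s): %s\n", $label, strlen( $char ), utf8_bytes( $char ) );
}
```

The Devanagari letter takes three bytes, which is why a column limited to a single-byte character set cannot hold it.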

So the bottom line is that our whole framework (database, WordPress server and browser) needs to speak and understand UTF-8 to make sure we display everything properly.

The great thing is that WordPress is UTF-8 compliant by default.  However, the database most people rely on is the open source MySQL bundled along with PHP and Apache, and its default settings for saving strings in the tables are not always set to UTF-8 encoding.  When the tables use a single-byte character set such as ASCII or Latin-1, characters that cannot be represented are lost and non-Latin scripts end up corrupted in the database, displaying as a string of question marks when retrieved.

MySQL supports UTF-8 through its utf8 character set, which stores up to 3 bytes per character, and through the newer utf8mb4 character set, which covers the full 4-byte range of UTF-8.  My default MySQL installation came with a Latin-script encoding, which resulted in lots of question marks.  I selected utf8mb4_general_ci, which works just fine.  The setting can be changed with the phpMyAdmin tool, where it is called the database collation.  You can find a tutorial here.
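If phpMyAdmin is not available, the same change can be expressed in SQL.  The sketch below only builds the statements as strings so that you can review them before running them yourself.  The helper name sy_build_convert_sql and the table names (which assume the default wp_ prefix) are assumptions of mine, and you should back up the database first.

```php
<?php
// Build one ALTER TABLE statement per table, converting it to the given
// character set and collation. Hypothetical helper for illustration only;
// it does not touch the database.
function sy_build_convert_sql( array $tables, string $charset = 'utf8mb4', string $collation = 'utf8mb4_general_ci' ): array {
    $statements = array();
    foreach ( $tables as $table ) {
        // Allow only simple identifiers so nothing unexpected ends up in the SQL.
        if ( ! preg_match( '/^[A-Za-z0-9_]+$/', $table ) ) {
            continue;
        }
        $statements[] = sprintf(
            'ALTER TABLE `%s` CONVERT TO CHARACTER SET %s COLLATE %s;',
            $table,
            $charset,
            $collation
        );
    }
    return $statements;
}

// Assumed table names for a default installation; adjust to your own prefix.
foreach ( sy_build_convert_sql( array( 'wp_posts', 'wp_terms', 'wp_options' ) ) as $sql ) {
    echo $sql, "\n";
}
```

Converting a table changes how new text is stored.  Text that was already saved as question marks is lost and has to be entered again.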

WordPress Admin Dashboard in other languages?

Next, we look at WordPress itself.  As mentioned earlier, it is UTF-8 compliant by default, and the encoding is set in the wp-config.php file:

// By default WordPress is UTF-8 compliant.

/** Database Charset to use in creating database tables. */
define('DB_CHARSET', 'utf8');

/** The Database Collate type. Don't change this if in doubt. */
define('DB_COLLATE', '');

// However, if you are using a different collation you can set it here, in place of the
// empty value above (a constant can only be defined once). Leaving it blank works just
// fine with MySQL utf8mb4_general_ci.
// define( 'DB_COLLATE', 'utf8_general_ci' );

WordPress has also been translated into many languages, so it is possible to set up translation files so that the Dashboard appears in your selected language.  It can easily be configured for each user in the profile settings.  You can either do it manually or simply install the Native Dashboard plugin, which will do it for you.  Here is what you need to do if, like me, you don’t like to overload your WordPress installation with plugins:

  • Download the required translation file from the WordPress repository.  It should be under the /branches/3.9/messages/ section, where 3.9 is the version of WordPress you have installed.  Not all languages are maintained properly, so you may find it under /trunk/messages/ instead, which you should know is usually a work in progress.  The file you need is the binary [languageCode]_[countryCode].mo.  You can also download the editable [languageCode]_[countryCode].po.  If you wish to create your own translation file, download the .po file of any language and use the excellent online poEditor to replace the translations with your own.  You can then save both the .mo and .po files once you’re done.
  • Upload the [languageCode]_[countryCode].mo file to your WordPress installation under the wp-content/languages/ directory.
  • Enable the language in the wp-config.php file by editing and un-commenting the line define( 'WPLANG', 'fr_FR' ); which should be in the second half of your file.  The value is the [languageCode]_[countryCode] code, that is, the name of the file you uploaded minus the extension.  In this case it is the French (France) translation.
  • Repeat these steps for every language you want to enable.

You should now be able to see your Dashboard in your installed language.

Is your Theme multi-lingual enabled?

Next we need to ensure that the theme is properly translated, otherwise the buttons, links and dynamic messages will all appear in the default language.  A properly developed theme comes with localization enabled, in other words the possibility to incorporate specific [languageCode]_[countryCode].mo files within your theme structure.  We use the excellent Elegant Themes, which come pre-translated in Russian, German and English by default and include PO files that can be translated into other languages.

However, there are also the default WordPress themes, such as TwentyFourteen and their offspring, which come pre-translated in various languages.  Customizer is another excellent free theme that comes with extra features to seamlessly integrate it with the Polylang plugin.

If your theme comes with localization enabled, you should be able to find a languages/ folder in your theme root folder, within which there should be a PO file for you to translate if your language isn’t translated already.
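For reference, a localization-ready theme typically loads its .mo files with a call like the one sketched below.  The text domain my-theme and the function name are placeholders, not taken from any theme mentioned here; load_theme_textdomain() and get_template_directory() are standard WordPress functions.

```php
<?php
// Sketch for a theme's functions.php. 'my-theme' and my_theme_load_translations
// are placeholder names; replace them with your theme's own text domain.
function my_theme_load_translations() {
    // Looks for files such as languages/hi_IN.mo inside the theme folder.
    load_theme_textdomain( 'my-theme', get_template_directory() . '/languages' );
}

// Register the hook only when WordPress is loaded, so the file can also be
// parsed on its own without errors.
if ( function_exists( 'add_action' ) ) {
    add_action( 'after_setup_theme', 'my_theme_load_translations' );
}
```

Strings in the templates are then wrapped in the WordPress translation functions, for example esc_html_e( 'Read more', 'my-theme' ); so that WordPress can look them up in the loaded file.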

Using a WordPress plugin to properly manage the multi-lingual content

This part is important and quite controversial, as there is more than one way to go about it.  WordPress is a Content Management System (CMS), and probably the world’s best too, and is therefore designed from the ground up to dish out practically any content.  As a result, a language is just another type of content as far as WordPress is concerned.  It is therefore logical that this content be treated like any other and get its own separate structure, namely posts within the WordPress framework.  This is the way recommended by the WordPress high-priests, but as you can read in the article linked here, there are many other approaches.

I have read a lot of good things about the commercial WPML plugin and it is apparent that it is the standard bearer.  I have also tried the Bogo plugin, which looked promising and has many good reviews, but it failed to work for me.  I finally settled for the excellent Polylang plugin, which does the work quite well.  The back-end coding could be improved somewhat, but I shall not complain given the incredible amount of effort that has gone into it.

Setting it up is quite easy.  Upload and install it through the plugin management section of your dashboard, then:

  • Make your way to Settings -> Languages in your Dashboard.
  • You will see a section to add new languages.  Just select the language(s) you want to enable, and you will see them being added to the list.  Note that the order field is not important and you can leave it empty.  If you have more than one language besides the default, you may want to order them so they appear in the right order in your menus.
  • Once you have loaded your languages, you can select the “String translation” tab at the top of the settings page, where you will see a list of strings to translate into your selected languages, such as the title of your blog and the titles of registered widgets.  You can leave them in the default language if you want to enable only certain sections of the site in other languages, which is what this tutorial is attempting to do.  Save your changes.
  • Finally, you have a ‘Settings’ tab which allows you to initialise the multi-language content management in your dashboard.  Careful!  Once you have set up this initialisation you cannot undo some of these settings without cleaning the database.  The main setting tells Polylang what your current default language is, and it will automatically configure any content you currently have (galleries, posts, pages, categories, custom taxonomies) as content for this language.  Polylang sets up separate language selectors in every content creation/edit page hereafter.
  • Select your default language, and check the box to set all your content as this language if you want.  You can always change the default language of the site, but you cannot undo the association of the default content.  There are various other settings which you can modify accordingly and which are quite self-explanatory.  I left mine at the defaults for now.

Ensuring multi-lingual scripts are properly displayed on all browsers with the correct font

You can load new fonts in your CSS stylesheet and apply them to a given language using the CSS :lang() pseudo-class.
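As a sketch of what this can look like, the snippet below prints a small :lang() rule into the page head from functions.php.  It assumes the Hindi pages carry lang="hi" (or hi-IN) on the <html> element, which themes normally output through language_attributes(), and that a Devanagari-capable font is available.  The font name Noto Sans Devanagari and the function names are my own choices.

```php
<?php
// Return the CSS that applies a Devanagari font to Hindi content only.
// Function names and the font choice are assumptions made for this example.
function sy_hindi_font_css(): string {
    return ':lang(hi) { font-family: "Noto Sans Devanagari", sans-serif; }';
}

// Print the rule inside <head>.
function sy_print_hindi_font_css() {
    echo '<style>' . sy_hindi_font_css() . '</style>' . "\n";
}

// Register the hook only when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
    add_action( 'wp_head', 'sy_print_hindi_font_css' );
}
```

The same rule can of course go straight into your stylesheet.  Either way, the font itself still has to be loaded, for example with an @font-face rule.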

If you have custom posts, plugins, or themes which need to be localized (enabled for multi-language display), you need to ensure your code includes the right WordPress functionality.  For a detailed tutorial, see this article.

You need to fully translate your content in your dashboard, including menus, categories, galleries and widgets.

How to mix multi-lingual content on the same page?

This is the crux of this tutorial.  At this point, your WordPress site is ready to dish out multi-lingual content according to the selected language.  By default, however, Polylang does not show content from other languages, which means that posts, galleries or other sections that are not translated are simply not shown.

The solution provided in the Polylang documentation is to modify the default query for a page and filter out duplicate posts in the post Loop.  This is not a very elegant solution, however, as it means that every theme page template needs to be overridden in a child theme, which rather defeats the purpose of having a child theme in the first place.

I wrote a small script that you can include in your functions.php file, which catches the query before it gets executed using the WordPress filter (hook) functionality.  The beauty of hooks is that you can catch all the queries in a single location and deploy your logic only once, which is much more elegant than doing it in the template loop:

add_filter( 'pre_get_posts', 'get_default_language_posts' );
function get_default_language_posts( $query ) {
  if ( $query->is_main_query() && function_exists('pll_default_language') && !is_admin() ) {
    $defLang = pll_default_language(); // default language of the blog
    $curLang = pll_current_language(); // current language requested by the browser
    // nothing to change when the visitor is already browsing the default language
    if ( $curLang && $defLang != $curLang ) {
      // Polylang stores translated post IDs in a serialized array in the description field of this custom taxonomy
      $terms = get_terms('post_translations');
      $filterPostIDs = array();
      foreach ( $terms as $translation ) {
        $transPost = unserialize($translation->description);
        // if a post exists in both languages, leave out its default-language copy
        if ( isset($transPost[$defLang], $transPost[$curLang]) ) $filterPostIDs[] = $transPost[$defLang];
      }
      $query->set('lang', $defLang.','.$curLang); // select both default and current language posts
      $query->set('post__not_in', $filterPostIDs); // remove the duplicate posts in the default language
    }
  }
  return $query;
}

Currently, galleries need to be fully duplicated in order to display properly (the images are re-used, not duplicated, by the Polylang plugin).  I will attempt a filter script that identifies gallery queries and follows a similar logic to that of the posts above to remedy this.

The language switcher

The Polylang plugin provides a widget to switch between languages on the front-end.  It also provides the ability to integrate the switcher into your menu structure.  Alternatively, you can call a function in your PHP template which will output a <select> drop-down box.
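A minimal sketch of that last option, for use in a template file.  pll_the_languages() is the Polylang template function and dropdown is one of its arguments; the wrapper name sy_language_dropdown is hypothetical.

```php
<?php
// Echo Polylang's language switcher as a <select> element.
// Returns false without output when Polylang is not active.
function sy_language_dropdown(): bool {
    if ( ! function_exists( 'pll_the_languages' ) ) {
        return false;
    }
    pll_the_languages( array( 'dropdown' => 1 ) );
    return true;
}

sy_language_dropdown();
```

The function_exists() check keeps the template from breaking if the plugin is ever deactivated.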

However, if like me you want more flexibility to ensure proper integration into your theme, with custom CSS classes embedded in the menu structure, then you can re-use this script which, when called in your PHP page, will echo a <ul> menu structure with <li> elements:

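function sy_polylang_switcher(){
    $translations = pll_the_languages(array('raw'=>1));
    $current = array();
    // separate the current language from the other available languages
    foreach($translations as $key => $language){
        if($language['current_lang']){
            $current = $language;
            unset($translations[$key]);
        }
    }
    $menu = '<ul id="lang_list" class="nav">';
    switch(count($translations)){
        case 0: // no other language to switch to
            break;
        case 1: // a single alternative: link straight to it
            $alternate = end($translations);
            $menu .= '<li class="menu-item"><a href="'.esc_url($alternate['url']).'">'.esc_html($alternate['name']).'</a></li>';
            break;
        default: // several alternatives: current language on top, the others in a sub-menu
            $menu .= '<li class="current_page_item">'.esc_html($current['name']);
            $menu .= '<ul class="sub-menu" style="visibility:hidden;display:none">';
            foreach($translations as $language)
                $menu .= '<li class="menu-item"><a href="'.esc_url($language['url']).'">'.esc_html($language['name']).'</a></li>';
            $menu .= '</ul></li>';
    }
    $menu .= '</ul>';
    echo '<div id="polylang_switcher">'.$menu.'</div>';
}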

Note that if you have more than one translation language, the menu appears as a sub-menu, with the top level being the current language and the other languages in a drop-down list.  If, on the other hand, you have only one other language, as on my site, the menu echoed is a single item linking to the other language.