Fun fact of the day: about 37% of WordPress downloads are for non-English, localized versions.

So as a plugin or theme author, you should be thinking of localization and internationalization (L10N and I18N) as pretty much a fact of life by this point.

Fun total guess of the day: based on my experience in browsing through the thing, roughly, ohh… all plugins and themes in the directory are doing-it-wrong in some manner.

Yes friends, even my code is guilty of this to some degree.

It’s understandable. When you’re writing the thing, generally you’re working on the functionality, not form. So you put strings in and figure “hey, no biggie, I can come back and add in the I18N stuff later.” Sometimes you even come back and do that later.

And you know what? You probably still get it wrong. I did. I still often do.

The reason you are getting it wrong is because doing I18N right is non-obvious. There’s tricks there, and rules that apply outside of the normal PHP ways of doing things.

So here’s the unbreakable laws of I18N, as pertaining to WordPress plugins and themes.

Note: This is not a tutorial, as such. You are expected to already be translating your code in some way, and to have a basic grasp on it. What I’m going to show you is stuff you are probably already doing, but which is wrong. With any luck, you will have much slapping-of-the-head during this read, since I’m hoping to give you that same insight I had, when I finally “got it”.

Also note: These are laws, folks. Not suggestions. Thou shalt not break them. They are not up for debate. What I’m going to present to you here today is provably correct. Sorry, I like a good argument as much as the next guy, but arguing against these just makes you wrong.

Basic I18N functions

First, let’s quickly cover the two top translation functions. There’s more later, and the laws apply to them too, but these are the ones everybody should know and make the easiest examples.

The base translation function is __(). That’s the double-underscore function. It takes a string and translates it, according to the localization settings, then returns the string.

Then there’s the shortcut function of _e(). It does the same, but it echoes the result instead.

There’s several functions based around these, such as esc_attr_e() for example. These functions all behave identically to their counterparts put together. The esc_attr_e() function first runs the string through __(), then it does esc_attr() on it, then it echoes it. These are named in a specific way so as to work with existing translation tools. All the following laws apply to them in the exact same way.

So, right down to it then.

Law the First: Thou shalt not use PHP variables of any kind inside a translation function’s strings.

This code is obviously wrong, or it should be:

$string = __($string, 'plugin-domain');

The reason you never do this is because translation relies on looking up strings in a table and then translating them. However, that list of strings to be translated is built by an automated process. Some code scans your PHP code, without executing it, and pulls out all the __()’s it finds, then builds the list of strings to be translated. That scanning code cannot possibly know what is inside $string.

However, sometimes it’s more subtle than that. For example, this is also wrong:

$string = __("You have $number tacos", 'plugin-domain');

The translated string here will be something like ‘You have 12 tacos’, but the scanning code can’t know what $number is in advance, nor is it feasible to expect your translators to translate all cases of what $number could be anyway.

Basically, double quoted strings in translation functions are always suspect, and probably wrong. But that rule can’t be hard and fast, because using string operations like 'You have ' . $number . ' tacos' is equally wrong, for the exact same reason.
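
To see why all three of those fail, here is a toy extractor. It is a sketch only, not the real WordPress tooling: the function name toy_extract_strings is made up, and it only knows about __(). It tokenizes source code without running it and keeps a string only when the first argument is one plain literal.

```php
<?php
// Toy string extractor (hypothetical, for illustration only).
// It reads PHP source as text, never executes it, and collects the first
// argument of __() only when that argument is a single plain string literal.
function toy_extract_strings( $source ) {
    $tokens = array_values( array_filter(
        token_get_all( $source ),
        function ( $t ) {
            // Drop whitespace so neighbouring tokens can be compared directly.
            return ! ( is_array( $t ) && T_WHITESPACE === $t[0] );
        }
    ) );

    $found = array();
    for ( $i = 0; $i + 3 < count( $tokens ); $i++ ) {
        $name = $tokens[ $i ];
        $open = $tokens[ $i + 1 ];
        $arg  = $tokens[ $i + 2 ];
        $next = $tokens[ $i + 3 ];

        $is_call    = is_array( $name ) && T_STRING === $name[0] && '__' === $name[1] && '(' === $open;
        $is_literal = is_array( $arg ) && T_CONSTANT_ENCAPSED_STRING === $arg[0];
        $arg_ends   = ',' === $next || ')' === $next;

        if ( $is_call && $is_literal && $arg_ends ) {
            $found[] = substr( $arg[1], 1, -1 ); // strip the surrounding quotes
        }
    }
    return $found;
}

// Sample source held in a nowdoc, so nothing in it is interpolated or run.
$sample = <<<'PHP'
<?php
$a = __( $string, 'plugin-domain' );
$b = __( "You have $number tacos", 'plugin-domain' );
$c = __( 'You have ' . $number . ' tacos', 'plugin-domain' );
$d = __( 'You have 12 tacos', 'plugin-domain' );
PHP;

print_r( toy_extract_strings( $sample ) );
```

Only the last call gives the extractor anything to work with. The variable, the interpolated string, and the concatenation all leave it with nothing it can hand to a translator.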

Here’s a couple of wrongs that people like to argue with:

$string = __('You have 12 tacos', $plugin_domain);
$string = __('You have 12 tacos', PLUGIN_DOMAIN);

These are both cases of the same thing. Basically, you decided that repetition is bad, so you define the plugin domain somewhere central, then reference it everywhere.

Mark Jaquith went into some detail on why this is wrong on his blog, so I will refer you to that, but I’ll also espouse a general principle here.

I said this above, and I’m going to repeat it: “that list of strings to be translated is built by an automated process“. When I’m making some code to read your code and parse it, I’m not running your code. I’m parsing it. And while the general simplistic case of building a list of strings does not require me to know your plugin’s text domain, a more complicated case might. There are legitimate reasons that we want your domain to be plain text and not some kind of variable.

For starters, what if we did something like make a system where you could translate your strings right on the wordpress.org website? Or build a system where you could enlist volunteer translators to translate your strings for you? Or made a system where people could easily download localized versions of your plugin, with the relevant translations already included?

These are but a few ideas, but for all of them, that text domain must be a plain string. Not a variable. Not a define.

Bottom line: Inside all translation functions, no PHP variables are allowed in the strings, for any reason, ever. Plain single-quoted strings only.

Law the Second: Thou shalt always translate phrases and not words.

One way people often try to get around not using variables is like the following:

$string = __('You have ', 'plugin-domain') . $number . __(' tacos', 'plugin-domain');

No! Bad coder! Bad!

English is a language of words. Other languages are not as flexible. In some other languages, the subject comes first. Your method doesn’t work here, unless the localizer makes “tacos” into “you have” and vice-versa.

This is the correct way:

$string = sprintf( __('You have %d tacos', 'plugin-domain'), $number );

The localizer doing your translation can then write the equivalent in his language, leaving the %d in the right place. Note that in this case, the %d is not a PHP variable, it’s a placeholder for the number.
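
A minimal sketch of what that buys you, using a made-up lookup table in place of a real translation file. The toy_translate function and the "translated" phrase are both assumptions for illustration; the real __() reads from the loaded translations for your text domain.

```php
<?php
// Simplified model of a loaded translation: original phrase => translation.
// The entry is invented; note that it moves the number to the end.
$translations = array(
    'You have %d tacos' => 'Tacos in your possession: %d',
);

// Hypothetical stand-in for __(): look up the whole phrase, and fall back
// to the original when there is no entry for it.
function toy_translate( $text, array $table ) {
    return isset( $table[ $text ] ) ? $table[ $text ] : $text;
}

$number = 12;
$line   = sprintf( toy_translate( 'You have %d tacos', $translations ), $number );
echo $line, "\n";
```

Because the whole phrase went through the lookup as one unit, the translator was free to put the number wherever their grammar wants it.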

In fact, this is a good place to introduce a new function to deal with pluralization. Nobody has “1 tacos”. So we can write this:

$string = sprintf( _n('You have %d taco.', 'You have %d tacos.', $number, 'plugin-domain'), $number );

The _n function is a translation function that picks the first string if the $number (third parameter to _n) is one, or the second one for anything else (zero included, in English at least). We still have to use the sprintf to replace the placeholder with the actual number, but now the pluralization can be translated separately, and as part of the whole phrase. Note that the last argument to _n is still the plugin text domain to be used.

Note that some languages have more than just a singular and a plural form. You may need special handling sometimes, but this will get you there most of the time. Polish in particular has pluralization rules that use different forms for 1, for numbers ending in 2, 3, and 4 (except 12 through 14), and for everything else: numbers ending in 5 through 9, in 0, or in 1 (except 1 itself). That’s okay, _n can handle these special cases with special pluralization handling in the translator files, and you generally don’t need to worry about it as long as you specify the plural form in a sane way, using the whole phrase.
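
For the curious, here is roughly what that Polish rule looks like when written out as code. This is a sketch of the rule as gettext catalogs commonly express it, which also treats 12 through 14 as part of the last group. The function name polish_plural_index is made up, and you would never write this in a plugin yourself, since the translation file carries the rule.

```php
<?php
// Which of the three Polish plural forms applies to $n?
//   0 => exactly 1
//   1 => numbers ending in 2, 3 or 4, except 12, 13 and 14
//   2 => everything else (0, 5-9, 11-14, 21, 25, ...)
function polish_plural_index( $n ) {
    if ( 1 === $n ) {
        return 0;
    }
    $last     = $n % 10;
    $last_two = $n % 100;
    if ( $last >= 2 && $last <= 4 && ( $last_two < 10 || $last_two >= 20 ) ) {
        return 1;
    }
    return 2;
}

foreach ( array( 0, 1, 2, 5, 12, 21, 22 ) as $n ) {
    echo $n, ' uses form ', polish_plural_index( $n ), "\n";
}
```

Your code only ever supplies the English singular and plural. Picking among three (or more) forms is the translation layer's job, which is exactly why the count has to be passed to _n as a number.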

You might also note that _n() is the one and only translation function that can have a PHP variable in it. This is because that third variable is always going to be a number, not a string. Therefore no automated process that builds strings from scanning code will care about what it is. You do need to take care that the $number in _n is always a number though. It will not be using that $number to insert into the string, it will be selecting which string to use based on its value.

Now, using placeholders can be complex, since sometimes things will have to be reversed. Take this example:

$string = sprintf( __('You have %d tacos and %d burritos', 'plugin-domain'), $taco_count, $burrito_count );

What if a language has some strange condition where they would never put tacos before burritos? It just wouldn’t be done. The translator would have to rewrite this to have the burrito count first. But he can’t, the placeholders are such that the $taco_count is expected to be first in the sprintf. The solution:

$string = sprintf( __('You have %1$d tacos and %2$d burritos', 'plugin-domain'), $taco_count, $burrito_count );

The %1$d and such is an alternate form that PHP allows called “argument swapping“. In this case, the translator could write it correctly, but put the burritos before the tacos by simply putting %2$d before %1$d in the string.
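
Here is a quick demonstration with plain sprintf, no WordPress needed. The "translated" string is just the English sentence reordered so the effect is easy to read; it stands in for whatever a translator would actually send back.

```php
<?php
$taco_count    = 3;
$burrito_count = 5;

// The string as it appears in the code.
$original = 'You have %1$d tacos and %2$d burritos';
// A stand-in for a translation that needs the burritos first.
$reordered = 'You have %2$d burritos and %1$d tacos';

// The sprintf arguments stay in the same order both times.
$first  = sprintf( $original, $taco_count, $burrito_count );
$second = sprintf( $reordered, $taco_count, $burrito_count );

echo $first, "\n";
echo $second, "\n";
```

The PHP call never changes. Only the string does, and the numbered placeholders keep each count attached to the right food.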

Note that when you use argument swapping, that single-quoted string thing becomes very important. If you have “%1$s” in double quotes, then PHP will see that $s and try to put your $s variable in there. In at least one case, this has caused an accidental Cross-Site-Scripting security issue.
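
If you want to see the problem with your own eyes, this small sketch shows it. It assumes a variable named $s happens to exist in scope, which is the whole danger; the value here is a harmless placeholder.

```php
<?php
// Some unrelated variable that just happens to be called $s.
$s = 'whatever was in $s';

// Single quotes: the placeholder survives untouched.
$single = 'Hello %1$s';

// Double quotes: PHP interpolates $s right here, long before sprintf
// (or a translation lookup) ever sees the string.
$double = "Hello %1$s";

var_dump( $single );
var_dump( $double );
```

The double-quoted version is no longer the string you wrote, so it will never match the string your translators were given, and whatever was in $s has leaked into your output.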

So repeat after me: “I will always only use single-quoted strings in I18N functions.” There. Now you’re safe again. This probably should be a law, but since it’s safe to use double-quoted strings as long as you don’t use PHP variables (thus breaking the first law), I’ll just leave you to think about it instead. 🙂

Law the Third: Thou shalt disambiguate when needed.

When I say “comment” to you, am I talking about a comment on my site, or am I asking you to make a comment? How about “test”? Or even “buffalo”?

English has words and phrases that can have different meanings depending on context. In other languages, these same concepts can be different words or phrases entirely. To help translators out, use the _x() function for them.

The _x() function is similar to the __() function, but it takes an extra context argument, right after the string, where you explain which meaning you intend.

$string = _x( 'Buffalo', 'an animal', 'plugin-domain' );
$string = _x( 'Buffalo', 'a city in New York', 'plugin-domain' );
$string = _x( 'Buffalo', 'a verb meaning to confuse somebody', 'plugin-domain' );

Though these strings are identical, the translators will get separated strings, along with the explanation of what they are, and they can translate them accordingly.
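
One way to picture it is a lookup table keyed by context as well as by string. This is a simplified model, not how WordPress stores things internally; the function name toy_translate_with_context is made up, and the German words are only there to illustrate.

```php
<?php
// Simplified model: translations grouped by context, then by original string.
$translations = array(
    'an animal'                          => array( 'Buffalo' => 'Büffel' ),
    'a city in New York'                 => array( 'Buffalo' => 'Buffalo' ),
    'a verb meaning to confuse somebody' => array( 'Buffalo' => 'verwirren' ),
);

// Hypothetical stand-in for _x(): the same original string can come back
// differently depending on the context that was given with it.
function toy_translate_with_context( $text, $context, array $table ) {
    return isset( $table[ $context ][ $text ] ) ? $table[ $context ][ $text ] : $text;
}

echo toy_translate_with_context( 'Buffalo', 'an animal', $translations ), "\n";
echo toy_translate_with_context( 'Buffalo', 'a city in New York', $translations ), "\n";
echo toy_translate_with_context( 'Buffalo', 'a verb meaning to confuse somebody', $translations ), "\n";
```

Without the context there would be a single "Buffalo" entry, and the translator would have to pick one meaning for all three uses.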

And just like __() has _e() for immediate echoing, _x() has _ex() for the same thing. Use as needed.

Finally, this last one isn’t a law so much as something that annoys me. You’re free to argue about it if you like. 🙂

Annoyance the First: Thou shalt not put unnecessary HTML markup into the translated string.

$string = sprintf( __('<h3>You have %d tacos</h3>', 'plugin-domain'), $number );

Why would you give the power to the translator to insert markup changes to your code? Markup should be eliminated from your translated strings wherever possible. Put it outside your strings instead.

$string = '<h3>'.sprintf( __('You have %d tacos', 'plugin-domain'), $number ).'</h3>';

Note that sometimes though, it’s perfectly acceptable. If you’re adding emphasis to a specific word, then that emphasis might be different in other languages. This is pretty rare though, and sometimes you can pull it out entirely. If I wanted a bold number of tacos, I’d use this:

$string = sprintf( __('You have %s tacos', 'plugin-domain'), '<strong>'.$number.'</strong>' );

Or more preferably, the _n version of same that I discussed above.
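
Spelled out, that _n version might look like the following. The _n() defined at the top is a stand-in I've added so the snippet runs outside WordPress; it only knows the English rule, whereas the real one uses the loaded translation and its plural rules.

```php
<?php
// Stand-in for running outside WordPress (assumption: English-only rule).
// Inside WordPress the real _n() already exists and this block is skipped.
if ( ! function_exists( '_n' ) ) {
    function _n( $single, $plural, $number, $domain = 'default' ) {
        return 1 === (int) $number ? $single : $plural;
    }
}

$number = 1;

// The markup wraps the value passed to sprintf, not the translated phrase.
$string = sprintf(
    _n( 'You have %s taco', 'You have %s tacos', $number, 'plugin-domain' ),
    '<strong>' . $number . '</strong>'
);

echo $string, "\n";
```

The translator sees a clean phrase with a single %s in it, and the bolding stays in your hands.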

Conclusion

Like I said at the beginning, we’ve all done these. I’ve broken all these laws of I18N in the past (I know some of my plugins still do), only to figure out that I was doing-it-wrong. Hopefully, you’ve spotted something here you’ve done (or are currently doing) and have realized from reading this exactly why your code is broken. The state of I18N in plugins and themes is pretty low, and that’s something I’d really like to get fixed in the long run. With any luck, this article will help. 🙂

Disclaimer: Yes, I wrote this while hungry.

113 Comments

  1. Google translate will ruin your sprintf phrase.

    Покупка %s штук => Buying % s pieces.

    So the last rule is DONT USE GT for sprintf phrases ))

    • Well, if you are seriously translating your website, you shouldn’t be using Google Translate anyway.
      If not, you’re better off using one of those front-facing automatic Google Translate plugins. At least they’ll have proper text to work with.

  2. We ran into many problems fully internationalizing our application. In ours, the verbiage may change depending on language and we needed our grammar to be correct as it pertained to legal verbiage.
    Ended up using namespace keys to separate the text and various files for the i18n packages. We also have a 3-layer override system – base, operation, tenant implementation, so customizations can be made to components when needed.
    File ‘base/i18n/en.js’ would contain something like:

    {
    "my.component.screen_name.title" : "My Screen Title",
    "my.component.screen_name.instructions" : "Do stuff correctly!"
    }

    File ‘myOperation/i18n/en.js’ would contain something like:

    {
    "my.component.screen_name.title" : "My Special Screen Title",
    "my.component.screen_name.help_text" : "You seem to be doing stuff wrong!"
    }

    and so on…

    Tenant implementation would override the operation level, and operation level could override base level implementations. Recommended that all ‘components’ of the system have base i18n entries – only things that are purely tenant or operation specific wouldn’t exist in those.
    When an operation is launched, these files are merged into a flat runtime i18n, based on which language is currently selected. This is done so operations could utilize common components as needed.

    These keys would then be used inline on html templates – for my example I’ll use handlebars:

    {{i18n my.component.screen_name.title}}

    We have plans to move this to a noSQL type storage to allow an admin tool to update/create tenant implementations without development involvement.

  3. I’m not sure I understand the first rule. First of all, using a variable in a translation function seems to work (the value of the variable is translated). If I were to follow the rule, how the hell am I supposed to translate strings that are not known at parse time? Can I only translate the “hardcoded” part of templates?

    • You don’t translate things not known at parse time. This is not machine translation like running something through Google Translate, this is translation by people, working from a list of strings.

      Basically, translation works by taking a big list of strings, translating them into another language, and then using a lookup table to replace the original strings with the new translated ones when the program runs. A variable can’t be translated because it can be anything. Our lookup table won’t have its value in the original list of strings.

      Yes, you can only translate things hardcoded into the template, because you’re not translating it at runtime, you’re translating it beforehand and putting all those translations into a *.MO file.

  4. Appreciate the article. While I realise it’s an old article, I too am having a hard time wrapping my head around rule one. In particular when it comes to translating content retrieved from get_bloginfo(), as the following would clearly violate said rule:

    $name = get_bloginfo( 'name' );
    $string = __( $name, 'plugin-domain' ); // this would be incorrect
    $string = sprintf( __('%s', 'plugin-domain'), $name ); // this would be 'correct' but doesn't work
    

    What would be the correct way of doing this? Thanks.

    • There is no correct way of doing that. You cannot translate things that you do not know in advance. The “name” of the blog, which you are getting from get_bloginfo, is not translatable by the i18n system, because you don’t know what it is in advance.

      The translation system works by building a big list of strings. You translate that list of strings, then replace the originals with them. Can’t build a big list of every possible string though, can you?

      This isn’t running through Google Translate or something here. This is not dynamic translation. It’s static. The translation is done by hand, by people. Not by an automated system.

    • You use internationalization to make all hard-coded things of a theme’s front-end presented in a different language, such as “leave a comment”, “There are n comments to this article”, date formats, etc.

      The name of the blog is put there by the owner, and you don’t even try to translate that, you just echo whatever the site’s owner put in.
      In case of a multilingual site, you don’t translate these things either. WPML / QTranslate-X and similar solutions will give the user the possibility to enter different language versions of his/her content in articles, pages, widgets, etc. In addition, the multilingual plugin will make sure to swap the language settings correctly, so that the themes and plugins will also present their part of the page in the correct language.

  5. Hi,
    I have used the plugin for calculation fields, but it’s not translating. I have downloaded the .po file. Can anybody guide me on how to upload that file for the calculation form field, so the form is converted to Arabic?

  6. Hi,

    How would you use _n for this example? you might have 0 tacos and 10 burritos, or 1 taco and 1 burrito, or 10 tacos and 1 burrito, or .. (ok you get it.)

    $string = sprintf( __('You have %1$d tacos and %2$d burritos', 'plugin-domain'), $taco_count, $burrito_count );
    
    • There is essentially no good way to do it. Multiple plurals in a sentence doesn’t work and doesn’t lend itself well to this type of translation. Realistically, I’d rewrite the sentence to not have that problem in the first place. Simplify it. Maybe make it a list or something.

    • Of course, this is a tricky one, since you need to put each item in the right form, depending on the count of each one. Perhaps one should format it as a shopping list?
      1. You have the following (%d item/s) in your cart:
      2. %d taco/s
      3. %d burrito/s

  7. Hello, I kindly ask for your help. I’ve posted this question on Stack Overflow but am not getting any help; can you point me in the right direction?
    Basically I don’t understand how to get translations working in a plugin add-on merged into a free one.
    https://stackoverflow.com/questions/45220890/wp-localization-how-to-translate-2-merged-pugins

  8. What is the correct way of dealing with bold and emphasized text in translatable strings?

    Should you just leave the <b> and <i> tags in the string? I cannot think of any other way.

    • Generally speaking, yes, those are emphasis in some way, and thus part of the phrase itself. There may be need to include them in the translation because different languages will emphasize different parts of the phrase.

  9. […] Have you know? 45% WordPress downloads are for non-English. Making your Theme translation ready is strongly recommended. Read introduction about Internationalization in WordPress Handbook first. I prefer Otto’s the pitfalls of i18n. […]

  10. […] role of the text domain when internationalizing WordPress plugins and themes. This topic has been addressed in the past, but it comes up again from time to time. Time to re-address […]

  11. Man this was a big help. Thank you!

  12. Great article, essential reading for anyone starting to do i18n on WordPress or in general.

    I would like to add my distaste for sprintf as the token replacer though. In my experience translators rarely understand the arcane placeholders, especially when there’s more than one.

    I now use PHP’s ‘strtr’ function, which allows for much more obvious replacements for translators to deal with.

    For example, “Shipped %1$d on %2$s” conveys nothing and is ripe for errors in typing, whereas “Shipped [QUANTITY] on [DATE]” (or similar) is clearer for everyone, and is much easier for the translator to switch around if the grammar demands it.

    There’s an answer on Stack Overflow here with more examples: https://stackoverflow.com/a/60182650/
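
    A minimal sketch of the strtr approach described in this comment. The plain string stands in for what __() would return, and the quantity and date values are invented for the example.

```php
<?php
// Stand-in for the translated template; in a plugin this would come from
// __( 'Shipped [QUANTITY] on [DATE]', 'plugin-domain' ).
$template = 'Shipped [QUANTITY] on [DATE]';

// strtr() swaps each named token for its value in a single pass.
$line = strtr( $template, array(
    '[QUANTITY]' => '3',
    '[DATE]'     => '2020-02-12',
) );

echo $line, "\n";
```

    The tokens still have to stay inside one plain, single-quoted string so they can be extracted and translated as a whole phrase.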
