Multilingual Jekyll: How to run your Jekyll site in more than one language

August 3, 2014 · 8 minutes read
Jekyll Multilingual

As some of you might be aware, English isn’t my native language. Although I pretty much prefer English over German (no matter what or where: movies, books, recipes, podcasts, …) not everyone in Germany can read/speak it as fluently as I can, if at all. So, naturally almost all my sites are available both in German and English, although most of the content is either/or.

This, of course, is not as trivial as it may sound. Even with blogging/site engines and CMSs such as WordPress, maintaining the same site and the same content in several languages is a hassle at best and damn near impossible at worst. WordPress in particular, as powerful and well maintained as it may be, does a really terrible job when it comes to maintaining content and pages in more than one language. It’s a hack, and you feel it.

To be honest, one of the reasons for migrating to Jekyll was that its simplicity makes it easier for me (as a developer) to create the site I want. Instead of hacking around in the huge pile of junk that is WordPress to get it barely working the way I’d like it to work, I can focus on creating content.
Still, even with Jekyll being as awesome as it is, it’s nowhere near easy to get a multilingual site going. In fact, Jekyll itself doesn’t even support it, and that’s fine: most people don’t need this advanced feature anyway.
Further, the existing Jekyll plugins that promise to enable you to post in more than one language barely work at all or do things in a very weird way.

So, I decided to roll my own solution. It was easy enough, even without knowing all the intricate details of how Jekyll does its magic…
My requirements were:

  • It should be possible to create both posts and pages in more than one language.
  • A post/page may also be published in a single language only.
  • A single-language post should still show up for the other languages, but be marked as being in a different language.
  • If a page/post is multilingual, only the selected language should be shown.
  • It should require as few changes to the project (and especially the directory structure) as possible.
  • The entire site has to be translated. That includes pages created by other plugins.

Sounds pretty straightforward, doesn’t it?
So, how did I do it?

Before we get to the technical details, let me show you how I post a single and a multilingual post. I’ve got two places where a language can be defined: either in the file name (something like my-post.de.md) or in the file’s front-matter (language: de).

I use the first option to define a multilingual post. Each language’s post is defined in its own file, but they should share the same prefix/slug so you can link to the same post in different languages by changing the language prefix (more on that later).

The second option is used for posts/pages in a single language. The header (front-matter) defines the primary (and only) language for this post. In this case, the filename doesn’t contain any language tags. My plugin then creates copies of this post for the other languages.
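To make the two options concrete, here is a minimal Ruby sketch of how a language could be detected. It is not the plugin's actual code: the helper name `detect_language`, the two-letter language pattern, and the returned pair are assumptions made for illustration.

```ruby
# Hypothetical helper, not taken from the plugin.
# Assumes language codes are exactly two lowercase letters placed right
# before the file extension, e.g. "my-post.de.md".
LANGUAGE_IN_NAME = /\A(?<slug>.+)\.(?<lang>[a-z]{2})\.(?<ext>[^.]+)\z/

# Returns [language, multilingual] for a file name and its front-matter hash.
# A language in the file name marks a multilingual post; otherwise the
# front-matter (or the default language) decides.
def detect_language(file_name, front_matter, default_language)
  match = LANGUAGE_IN_NAME.match(file_name)
  return [match[:lang], true] if match

  [front_matter.fetch("language", default_language), false]
end
```

With this sketch, `my-post.de.md` counts as the German version of a multilingual post, while `my-post.md` with `language: de` in its front-matter counts as a single-language German post.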

The plugins

Which brings me to the plugin, or to be more precise the two plugins, that make this work.
We’ll focus on the first plugin (called “LanguageGenerator”) for now, as it’s the one dealing with languages and posts/pages. The second plugin will come into play when we talk about templates and translations.

The LanguageGenerator plugin does three things. First, it monkey-patches Jekyll’s post and page classes to provide the correct URLs for the page’s/post’s language. The language is either extracted from the file name or taken from the file’s front-matter. The generated URLs look like this: /{language}/{slug}/, e.g. /en/2014/08/03/my-awesome-post/
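The idea of patching a class so that its URL gains a language prefix can be sketched in plain Ruby. `FakePost` is a stand-in for Jekyll's post class, and the use of `Module#prepend` is an assumption; the actual plugin may patch Jekyll differently.

```ruby
# Stand-in for Jekyll's post class; it only has what the sketch needs.
class FakePost
  attr_reader :language

  def initialize(url, language)
    @url = url
    @language = language
  end

  # The URL as it would look without any language handling.
  def url
    @url
  end
end

# Hypothetical patch: wraps the original #url and prefixes the language.
module LocalizedUrl
  def url
    original = super()
    "/#{language}/#{original.sub(%r{\A/+}, "")}"
  end
end

FakePost.prepend(LocalizedUrl)
```

After the patch, a post created with the URL `/2014/08/03/my-awesome-post/` and the language `en` reports `/en/2014/08/03/my-awesome-post/`.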

Further, the plugin generates copies (one for each of the other languages) of single-language pages/posts and of pages/posts that don’t define any language, such as old posts or pages generated by other plugins. For those, the first configured language is used as the default.
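The copying step might look roughly like the following sketch. The `Doc` and `Rendered` structs and the function name are made up for illustration; the real plugin works on Jekyll's own post and page objects.

```ruby
# Hypothetical data holders, not Jekyll classes.
Doc = Struct.new(:slug, :language, :multilingual)
Rendered = Struct.new(:slug, :site_language, :written_in)

# Multilingual documents are rendered once, for their own language.
# Single-language documents (and those without any language) are rendered
# once per configured language; the first language is the default.
def expand_for_languages(docs, languages)
  default = languages.first
  docs.flat_map do |doc|
    next [Rendered.new(doc.slug, doc.language, doc.language)] if doc.multilingual

    written_in = doc.language || default
    languages.map { |target| Rendered.new(doc.slug, target, written_in) }
  end
end
```

Keeping `written_in` next to `site_language` is what would later allow a template to mark a post as being written in a different language.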

Last but not least, it provides a Tag that generates an icon/text representation of your post’s language (for templates) and a Filter to filter out posts that aren’t in the desired language.

This brings us to the most difficult part of this whole process: dealing with other plugins.
Naturally, most of the other (third-party) plugins don’t support any kind of localization, let alone my very own version of it. This also means that you have to incorporate the pages generated by those plugins into the multilingual workflow.
Luckily, the approach I described above (copying pages that don’t have a language defined) covers the majority of the plugins I’m using. For the most part, making the templates used by those plugins work with multiple languages is enough.
Only a select few plugins (e.g. the categories and author page plugins) either don’t use templates and have inline HTML code, or mess with Jekyll in another way (like calling page.render when they really shouldn’t). Those plugins need to be patched, which isn’t always easy, but completely doable. I’ll detail those changes in a later post. If you need help, feel free to contact me.

The config

Configuring the desired languages is pretty easy:
Just put languages: ["en", "de"] in your _config.yml and you’re good to go. The first language will be your primary (i.e. default) language. I haven’t really tested my approach with more than two languages, but in theory it should work with more than that.
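For illustration, this is roughly how a plugin could read that setting. The helper name `configured_languages` and the error raised for a missing list are assumptions, not the plugin's behavior.

```ruby
require "yaml"

# Hypothetical helper: returns the configured languages from the text of
# a _config.yml. The first entry is treated as the default language.
def configured_languages(config_yaml)
  config = YAML.safe_load(config_yaml) || {}
  languages = Array(config["languages"]).map(&:to_s)
  raise ArgumentError, "no languages configured" if languages.empty?

  languages
end
```

Calling `configured_languages('languages: ["en", "de"]').first` yields the default language, here `en`.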

Templates, templates, templates

Now that the plugins are in order (we haven’t really talked much about the second plugin, but we will now), all that is left are the templates: the base templates that define the frame of your site, the templates for posts, the ones required for third-party plugins and pages that are HTML-only (like your index.html).

To make them work for multiple languages, there are five jobs to be done:

  1. change any (static) URL to use the correct language prefix (e.g. /en or /de).
  2. translate any static text.
  3. provide a way to switch the language.
  4. provide a way to show the language a post was written in (if it’s a single-language post not written in the currently selected language).
  5. filter out posts that aren’t in the correct language for the page that is currently generated.

Step one is pretty straightforward:
Each post/page object now contains a language property. All you need to do is search for URLs in your HTML code and prefix them with {{ page.language }} when it’s a link to another page, or {{ post.language }} when it’s a link to a post.
Like this:
{% codeblock lang:html %} {% endcodeblock %}

For step 2, we need my second plugin called TranslationPlugin. Its job is to translate static text. I “borrowed” most of the code for that plugin from the Jekyll Multiple Languages Plugin. You can check out their documentation to learn how this works.
Here’s the gist of it: to translate a particular string in your HTML code, you replace it with a special tag.

Before:
{% codeblock lang:html %} About {% endcodeblock %}

After:
{% codeblock lang:html %} {% t navigation.about %} {% endcodeblock %}

The translations are done in language-specific YML files located in a special folder called _i18n inside your project’s source folder.

The English file (en.yml) looks like this:
navigation:
  about: About
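How a dotted key such as navigation.about could be resolved against these files is sketched below. This is not the TranslationPlugin's code: the `translate` helper, the fallback to the key itself, and the German value are assumptions made for the example.

```ruby
require "yaml"

# Hypothetical translation data; the German entry is a made-up placeholder.
def sample_translations
  {
    "en" => YAML.safe_load("navigation:\n  about: About\n"),
    "de" => YAML.safe_load("navigation:\n  about: Info\n")
  }
end

# Walks the nested hash along the dotted key. Falls back to the key itself
# when the language or the key is unknown (an assumed behavior).
def translate(translations, language, key)
  value = key.split(".").reduce(translations.fetch(language, {})) do |node, part|
    node.is_a?(Hash) ? node[part] : nil
  end
  value.is_a?(String) ? value : key
end
```

Returning the key for a missing translation keeps untranslated strings visible in the generated page instead of silently rendering nothing.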

Step 3, again, is pretty straightforward. As you can see if you scroll down to the footer of this site, I’ve simply added a set of flags that link to the same page in a different language. Here’s the template:
{% codeblock lang:html %} {% raw %} {% endraw %} {% endcodeblock %}

To indicate that a post was written in a different language (step 4), the LanguageGenerator plugin provides a handy filter that contains three methods:

  • language_flag(post) returns an HTML image tag containing the flag icon of the post’s main language.
  • language_text(post) just returns the language info as text, e.g. “(en)”.
  • isNonMultilingualPostInDifferentLanguage(post) just returns a boolean indicating whether this post was written in a different language than the current page’s one.
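A rough Ruby sketch of what such helpers might do follows. It is an illustration only: posts are plain hashes here, the flag image path is a placeholder, the "multilingual" key is assumed, and the third helper takes the page language as an explicit argument instead of reading it from the template context.

```ruby
require "cgi"

# Hypothetical stand-ins for the filter methods described above.
module LanguageFilterSketch
  module_function

  # Text marker such as "(en)".
  def language_text(post)
    "(#{post["language"]})"
  end

  # <img> tag for the flag; the /images/flags/ path is a made-up placeholder.
  def language_flag(post)
    lang = CGI.escapeHTML(post["language"].to_s)
    %(<img src="/images/flags/#{lang}.png" alt="#{lang}">)
  end

  # True for single-language posts written in a language other than the
  # one of the page currently being generated.
  def non_multilingual_post_in_different_language?(post, page_language)
    !post["multilingual"] && post["language"] != page_language
  end
end
```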

Here’s how they’re used (post title):
{% codeblock lang:html %} {% endcodeblock %}

That leaves us with step 5: filtering the array of posts to only get the ones that are either written in the language that Jekyll is generating right now or are single-language posts (remember: Jekyll will use the same template to generate pages for all your defined languages).
To achieve this, the LanguageGenerator plugin provides another template tag called language_array that does exactly this. It takes a list of posts and the desired language as arguments and stores the posts that match (or are written in a single language) in a new array:
{% codeblock lang:html %}
{% raw %}
{% language_array site.posts language_posts page.language %}
{% for post in language_posts limit: site.recent_posts %}
  … your post template …
{% endfor %}
{% endraw %}
{% endcodeblock %}
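The selection rule behind language_array can be expressed in a few lines of Ruby. This is a sketch of the rule as described above, not the tag's implementation; posts are plain hashes and the "multilingual" key is an assumption.

```ruby
# Hypothetical helper: keeps posts written in the requested language plus
# all single-language posts, which are shown for every language.
def posts_for_language(posts, language)
  posts.select do |post|
    post["language"] == language || !post["multilingual"]
  end
end
```

For an English page, a multilingual post's German version is dropped, its English version is kept, and a German-only post is kept so that it can be marked as being in a different language.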

That’s all there is to it.

Now, creating multilingual posts in Jekyll is as easy as placing a file in the _posts folder.

The code for my plugins can be found on GitHub.
If you have any questions, feel free to open an issue or send me a message. And if you need more of my time (e.g. to set up your own Jekyll/Octopress sites for your company): I’m available for hire.