L10N and I18N Best Practices

DevToolsGuy / Wednesday, January 25, 2017

This post will focus on localization and internationalization in software. It will provide some guidance on best practices, as well as what to avoid. If you haven't read my previous blog post introducing Localization and Internationalization, you can read it here.

Best Practices

Get Started Early

Don't write your application and then rewrite it to be internationalized. Save yourself some time and plan for it from the beginning. Get the project manager on board and make sure the specification research is done for all target locales. If you need stakeholder approval, explain the benefits of internationalization with regard to market expansion and user experience: more potential customers means more sales. If that doesn't do it, write a sample app or make a prototype in another language, and ask them to use it.

Flexible Layout

As mentioned in the previous post, text length varies between languages. There are two major options here:

  1. Adjust the layout for each locale where necessary.
  2. Allow enough space to fit all target languages. You can place labels above entry fields to provide more space, and right-align text in front of inputs if they need to be on the same line.

Support For Non-English Input

Not all languages use the A-Z alphabet. Make sure that your application supports non-English characters. For example, there are characters with accents above letters, and there are also languages that require multiple keystrokes to input one character through an input method editor (IME). Make sure that support for the target languages is a development requirement and test it early.

Unicode Encoding

Ever see a message show up as all "?" or blocks ([])? Chances are that the page is incorrectly encoded. It's recommended to use Unicode (UTF-8) encoding in your files. In some applications the default is ANSI or other encodings based on your region, so be sure to check!
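As a rough illustration of the point above, here is a minimal Python sketch. The sample text and the helper name save_and_load are made up for this example. It writes and reads a file with an explicit UTF-8 encoding instead of relying on the platform default, and then shows how "?" characters appear when text is forced into an encoding that can't represent it.

```python
import os
import tempfile

# Sample text with accented, Cyrillic, and Japanese characters (illustrative only).
text = "café, Привет, こんにちは"


def save_and_load(content, encoding="utf-8"):
    """Write content to a temporary file and read it back, naming the encoding explicitly."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(content)
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)


# With UTF-8 on both sides, the text survives the round trip unchanged.
round_tripped = save_and_load(text)

# Forcing the text into ASCII replaces every unsupported character with "?".
lossy = text.encode("ascii", errors="replace").decode("ascii")
```

Here round_tripped equals the original text, while lossy comes out as "caf?, ??????, ?????", which is the kind of output the question marks above usually point to.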

Culture-Specific Formatting

Numbers, dates, times, currencies, and even calendars can have different formats. For example, some cultures use "." to separate thousands from hundreds in numbers. Many will use different currency symbols. Many cultures use 24-hour ("military") time instead of 12-hour time with "AM" and "PM". Dates vary widely. The common format in Europe is day/month/year, whereas in Japan it is year/month/day. There are even different types of calendars, like the imperial calendar in Japan and the Hebrew calendar in Israel, both of which are used in government.
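To make the differences concrete, here is a small Python sketch. The FORMATS table is a hand-written assumption covering three locales for illustration only, and the function names are made up. A real application would normally pull these conventions from the platform's culture data or a localization library rather than maintain its own table.

```python
from datetime import date

# Hypothetical, hand-written conventions for three locales (illustration only).
FORMATS = {
    "en-US": {"thousands": ",", "decimal": ".", "date": "{m:02d}/{d:02d}/{y}"},
    "de-DE": {"thousands": ".", "decimal": ",", "date": "{d:02d}.{m:02d}.{y}"},
    "ja-JP": {"thousands": ",", "decimal": ".", "date": "{y}/{m:02d}/{d:02d}"},
}


def format_number(value, locale_code):
    """Format a number with two decimals using the locale's separators."""
    conv = FORMATS[locale_code]
    us_style = f"{value:,.2f}"  # e.g. "1,234,567.89"
    # Swap both separators in a single pass so they don't overwrite each other.
    table = str.maketrans({",": conv["thousands"], ".": conv["decimal"]})
    return us_style.translate(table)


def format_date(value, locale_code):
    """Format a date in the order the locale expects."""
    pattern = FORMATS[locale_code]["date"]
    return pattern.format(y=value.year, m=value.month, d=value.day)


amount = 1234567.891
posted = date(2017, 1, 25)
```

With these assumptions, format_number(amount, "de-DE") gives "1.234.567,89" and format_date(posted, "ja-JP") gives "2017/01/25", while the "en-US" versions are "1,234,567.89" and "01/25/2017".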

Allow for Right to Left

If one of your target locales uses right-to-left layout, make sure that your UI accounts for that.

Native Speaker Review

Have a native speaker review the localized application. This will help to catch any internationalization or localization issues that were missed, and check that the localized product makes sense and is usable. If you have any concerns about the localization, this is the time to raise them and get feedback.

What to Avoid

Hardcoded Text

To check that the application is properly internationalized, you can substitute all text and applicable images with placeholders (like Lorem Ipsum for text and sample images), then provide it for testing at an early stage. If testers find something that's not a placeholder, or something that doesn't work correctly with the placeholder content, then you're not fully internationalized. For example, you might have a string equality check in your code-behind that only works for the default locale. It's best to catch those things early. This is easy to do if your strings (and other resources where necessary) are externalized to separate files (like .resx in .NET). You can simply create a copy of the file with the placeholder content and test with that locale applied. This way you don't lose any of your real content, and once it has passed testing, you can localize it in-house or outsource it to a language vendor.
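The placeholder test can be sketched in a few lines of Python. Everything here is hypothetical: the DEFAULT_STRINGS table stands in for an externalized resource file, and the sketch uses marker-style placeholders built from the resource key instead of Lorem Ipsum, simply because they are easy to detect automatically.

```python
# Hypothetical externalized strings for the default locale
# (standing in for something like a .resx file).
DEFAULT_STRINGS = {
    "greeting": "Welcome back",
    "save_button": "Save",
    "status_ready": "Ready",
}


def make_placeholder_strings(strings):
    """Return a copy of the table in which every value is an obvious placeholder."""
    return {key: "[## " + key + " ##]" for key in strings}


def find_hardcoded(rendered_texts, placeholder_strings):
    """Return rendered texts that did not come from the placeholder table."""
    allowed = set(placeholder_strings.values())
    return [text for text in rendered_texts if text not in allowed]


placeholders = make_placeholder_strings(DEFAULT_STRINGS)

# Pretend this is the text collected from one screen; "Cancel" was typed
# directly into the code instead of being looked up in the resource table.
screen = [placeholders["greeting"], "Cancel", placeholders["save_button"]]
leftovers = find_hardcoded(screen, placeholders)

# A check against displayed text only works in the default locale...
fragile_is_ready = placeholders["status_ready"] == "Ready"
# ...so compare against a locale-independent value instead (here, the key).
current_status_key = "status_ready"
robust_is_ready = current_status_key == "status_ready"
```

In this sketch leftovers contains only "Cancel", the hardcoded label. The fragile equality check fails as soon as the placeholder locale is applied, while the key-based check still passes.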

Offensive Content

Different cultures will interpret things differently. Colors and symbols have different meanings. There are also cultural sensitivities, like war history and territory disputes. You can customize content to fit the target culture.

Text in Images

If possible, don't put text in images. This way there is a higher chance that the image can be reused across locales. If you need text, you can overlay it on the image so that the image itself doesn't have to be changed.

Concatenated Strings

Word order is different between languages. Don't count on it being the same, or else you'll get unintelligible translations. For example, given a string variable called "favoriteColor", a sentence like ["My favorite color is " + favoriteColor] won't work in all languages. The color word won't always come at the end of the sentence. A format string with a placeholder can be used here as an alternative, so that the placeholder can be moved in the translation, and it's clear to the translator that something is inserted. This would instead become ["My favorite color is {0}"], so the {0} could be moved to wherever in the sentence it is needed.
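Here is the same idea as a short Python sketch. The TEMPLATES table and the function name are assumptions made for illustration. In the Japanese template the placeholder sits in the middle of the sentence, which plain concatenation could not produce.

```python
# Hypothetical translated templates; each translator places {0} where it belongs.
TEMPLATES = {
    "en": "My favorite color is {0}.",
    "ja": "私の好きな色は{0}です。",
}


def favorite_color_sentence(locale_code, favorite_color):
    """Build the sentence from the locale's template instead of concatenating."""
    return TEMPLATES[locale_code].format(favorite_color)


english = favorite_color_sentence("en", "blue")
japanese = favorite_color_sentence("ja", "青")
```

Note that the color name passed in has to be localized as well. The template only solves the word order problem.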

Spaces For Layout

Never use additional spaces in labels to adjust where text appears. Use text alignment properties instead. The reason why spaces won't achieve the desired result is that the text length is different between languages, and it will be frustrating to figure out how many spaces are needed in each language based on trial and error. This may also differ based on OS or browser default font settings.

Confusing Content

Make your content easy to understand. If you use a language vendor, you want to make their job easy so that the final product is high quality. Using slang or abbreviations in content increases the risk of misunderstanding, and can result in poor localization. If there are space limitations that require abbreviation, the design could be modified, or there should be a comment left in the externalized content to explain the restriction. Additionally, if there are ambiguous words, leave a comment to explain the usage. For example, the word "last" could be a verb ("to remain"), an adjective meaning "final", or an adjective meaning "previous". If there is a label in your application with only this word, it would be good to either clarify it in the label, or leave a comment in the resource file.
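One way to picture translator comments is the Python sketch below. The ResourceEntry structure, the sample entries, and the one-word heuristic are all assumptions for illustration. Real resource formats (like .resx) have their own comment fields. The helper flags very short strings that have no comment, since those are the ones most likely to be ambiguous.

```python
from dataclasses import dataclass


@dataclass
class ResourceEntry:
    """A hypothetical externalized string with an optional note for the translator."""
    key: str
    text: str
    comment: str = ""


ENTRIES = [
    ResourceEntry("nav_last", "Last", "Adjective meaning 'final': jumps to the final page."),
    ResourceEntry("login_last", "Last login: {0}", "'Last' means 'previous'; {0} is a date."),
    ResourceEntry("ok_button", "OK"),
]


def entries_needing_comments(entries, max_words=1):
    """Return keys of very short strings that have no translator comment."""
    return [
        entry.key
        for entry in entries
        if len(entry.text.split()) <= max_words and not entry.comment
    ]


missing = entries_needing_comments(ENTRIES)
```

In this example only "ok_button" is flagged. The single-word "Last" label passes because its comment already explains which meaning is intended.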