Friday, October 16, 2009

Wikipedia uses the CLDR data

I travelled with Siebrand the other day and learned that, in order to provide plural support, he uses the information in the CLDR to determine which languages need plural support and in what way.

The amazing thing was that for some languages the plural support in MediaWiki is different from the one indicated by the standard. There are also a number of languages where the CLDR did not have information about their plural support.

It is vital that the CLDR and MediaWiki agree on how to provide plural support for languages. The CLDR is the standard and should be complete and correct because it exists for any application.
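
To make concrete what "plural support" means, here is a minimal sketch of how an application might pick a plural category for a whole number. The function name `plural_category` is my own invention and is not MediaWiki's code; the rules are simplified to integers and modelled on the plural rules CLDR publishes for English, Russian and Japanese.

```python
def plural_category(lang: str, n: int) -> str:
    """Return a CLDR-style plural category for a non-negative integer.

    Hypothetical helper for illustration only: it covers three languages
    and ignores fractions, which the real CLDR rules also handle.
    """
    if lang == "en":
        # English: singular for exactly 1, plural otherwise.
        return "one" if n == 1 else "other"
    if lang == "ru":
        mod10, mod100 = n % 10, n % 100
        # 1, 21, 31, ... but not 11
        if mod10 == 1 and mod100 != 11:
            return "one"
        # 2-4, 22-24, ... but not 12-14
        if 2 <= mod10 <= 4 and not 12 <= mod100 <= 14:
            return "few"
        # every other integer: 0, 5-20, 25-30, ...
        return "many"
    if lang == "ja":
        # Japanese makes no grammatical plural distinction.
        return "other"
    raise ValueError(f"no plural rule in this sketch for {lang!r}")


if __name__ == "__main__":
    for number in (1, 3, 11, 21):
        print(number, plural_category("en", number), plural_category("ru", number))
```

A message system then stores one translation per category for each language, which is why it matters that the application and the standard agree on the categories.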

Wednesday, September 9, 2009

African Locales: completion deadline October 1

The African Network for Localization (ANLoc) is seeking immediate help to create Locales for 100 African languages. You can view a description of the project at

You can help in one of three ways:
-> volunteer to work on a locale yourself (the project will help you every step of the way!)
-> play matchmaker - introduce someone who can volunteer for their language
-> spread the word - pass along this message to your networks, so that we increase the chances of finding volunteers for many different languages

THIS YEAR'S DEADLINE to get new languages into the CLDR (Common Locale Data Repository), the international system used to produce all major software on the planet, is OCTOBER 1. So, we need to connect with people who speak languages from all over Africa. And, we need to complete each locale THIS MONTH.

The full list of languages currently in the project is at . If your favorite language shows any red in any of the bars next to it, please volunteer to help complete the locale!

It's easy to volunteer - just send an email to

The interface to build a locale in your favorite African language is available in English, French, and Swahili. Building a locale only takes a couple of hours. Please tell your friends, tell your colleagues, tell your networks!

A quick, true story - one Friday last month, someone in Nairobi took a couple of minutes to provide an introduction between the Locales project and a colleague of theirs working on the Kreole Morisyen language of Mauritius. A few emails were exchanged, and by Monday the Morisyen locale was 90% finished. By the end of that week, the locale was complete. On October 1, this locale will be submitted to CLDR. By early next year, Morisyen will be forevermore part of the universe of languages available for information technology development.

It just takes one person and a couple of hours to finish a locale for a language, but it takes a lot of villagers on the web to find that one person. Thanks in advance for volunteering, for introducing contacts, and/or for passing along this message!

Thursday, April 16, 2009

Africa helping itself on the Internet

In December I blogged about the Afrigen project. In this project people are asked to add CLDR information for their language. Now, after some months, there are results, and I am impressed. Many languages have made a start and the first languages have completed all the information that is looked for in this standard.

In my opinion having quality information in the "Common Locale Data Repository" is a litmus test for the readiness of a language for the Internet. The Afrigen project makes completed data available in their Subversion repository.

The CLDR itself distinguishes levels of CLDR support; this includes how lists are sorted, how numbers are written and what a few languages are called. For this project to insist on a complete set of data takes courage, but it is in my opinion the right thing to do.

There are people who say that a language is on the map when it has its own Wikipedia; in my opinion a complete set of CLDR data has a much wider application.

Monday, January 26, 2009

Unintended consequences

The fiu-vro Wikipedia is a Wikipedia in the Võro language. People applied for an ISO-639-3 code recently, and this request was granted; the Võro language is now known under the vro code. This has changed the status of this project considerably. Where it used to be a project that existed because "things happened in those days", the language now complies with all the requirements for a new project. We have started the process of renaming the message file for this project, and we have requested the rename of the project.

There is one glitch. The Estonian Wikipedia is known as . The ISO-639-1 et code is connected to the ISO-639-3 est code, and this just became a macrolanguage. Standard Estonian has been given its own code of ekk.
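
The relationship between these codes can be sketched as a small lookup. The table and function names are made up for illustration, and the tables hold only the codes discussed in this post, not the full ISO 639 registry.

```python
# Illustrative tables only: they contain just the codes mentioned in this post.
ISO_639_1_TO_3 = {"et": "est"}

# A macrolanguage code covers several individual language codes.
MACROLANGUAGE_MEMBERS = {"est": ["ekk", "vro"]}

LANGUAGE_NAMES = {
    "est": "Estonian (macrolanguage)",
    "ekk": "Standard Estonian",
    "vro": "Võro",
}


def individual_languages(code: str) -> list[str]:
    """Expand a two- or three-letter code to individual ISO 639-3 codes."""
    three_letter = ISO_639_1_TO_3.get(code, code)
    # A code that is not a macrolanguage simply stands for itself.
    return MACROLANGUAGE_MEMBERS.get(three_letter, [three_letter])


if __name__ == "__main__":
    for code in ("et", "vro"):
        names = [LANGUAGE_NAMES[c] for c in individual_languages(code)]
        print(code, "->", ", ".join(names))
```

Read this way, et no longer points at one language: it expands to both ekk and vro, which is exactly the ambiguity a rename of the Estonian Wikipedia would remove.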

It is quite clear that technically it would be preferable to rename the Estonian Wikipedia. It can be done; this will be demonstrated with the rename of the Võro Wikipedia. From a community perspective it is not so clear cut. People are conservative: they do not like change, and there are a lot of references out there to the Estonian Wikipedia.

For the Võro community, it is a badge of pride to have their own ISO-639-3 code. For the Estonian community it is a nuisance.