CLDR, the Common Locale Data Repository project from the Unicode Consortium, provides translated, locale-specific information such as language names, country names, currencies and date/time formats that can be used in various applications. This data, used across several platforms, is particularly useful in maintaining parity of locale information in internationalized applications. In MediaWiki, the CLDR extension provides localized data and functions that can be used by developers.

The CLDR project constantly updates and maintains this database and publishes it twice a year. The information is periodically reviewed through a submission and vetting process. Individual participants and organisations can contribute during this process to improve and add to the CLDR data. The most recent version of CLDR was released in September 2014.

An important part of the CLDR data is the set of rules that determine how plurals are handled within the grammar of a language. In CLDR versions 25 and 26, plural rules for several languages were altered. These changes have now been incorporated in MediaWiki, which until this update was still using rules from CLDR version 24.

The affected languages are: Russian (ru), Abkhaz (ab), Avaric (av), Bashkir (ba), Buryat (bxr), Chechen (ce), Crimean Tatar (crh-cyrl), Chuvash (cv), Ingush (inh), Komi-Permyak (koi), Karachay-Balkar (krc), Komi (kv), Lak (lbe), Lezghian (lez), Eastern Mari (mhr), Western Mari (mrj), Yakut (sah), Tatar (tt), Tatar-Cyrillic (tt-cyrl), Tuvinian (tyv), Udmurt (udm), Kalmyk (xal), Prussian (prg), Tagalog (tl), Manx (gv), Mirandese (mwl), Portuguese (pt), Brazilian Portuguese (pt-br), Uyghur (ug), Lower Sorbian (dsb), Upper Sorbian (hsb), Asturian (ast) and Western Frisian (fy).
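To give a feel for what a plural rule does, here is a minimal Python sketch for Russian, the first language in the list. It is an illustration only and not the code used by MediaWiki or CLDRPluralRuleParser. The function names and the sample word forms are hypothetical, and the rule itself is an assumption based on the integer rules CLDR publishes for Russian; fractional numbers, which fall into a separate category, are left out. Please check the CLDR plural rules charts for the authoritative definitions.

```python
def russian_plural_category(n: int) -> str:
    """Return a plural category name for a non-negative integer.

    Assumption: this mirrors the integer part of the Russian rules
    published by CLDR. Fractions are deliberately not handled here.
    """
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    last_digit = n % 10
    last_two_digits = n % 100
    # 1, 21, 31, ... but not 11, 111, ...
    if last_digit == 1 and last_two_digits != 11:
        return "one"
    # 2-4, 22-24, ... but not 12-14, 112-114, ...
    if 2 <= last_digit <= 4 and not 12 <= last_two_digits <= 14:
        return "few"
    # Everything else among integers: 0, 5-20, 25-30, ...
    return "many"


# Hypothetical translated forms of the word "file", one per category.
FILE_FORMS = {"one": "файл", "few": "файла", "many": "файлов"}


def format_file_count(n: int) -> str:
    """Pick the word form that matches the plural category of n."""
    return f"{n} {FILE_FORMS[russian_plural_category(n)]}"


if __name__ == "__main__":
    for count in (1, 2, 5, 11, 21, 22, 112):
        print(format_file_count(count))
```

A translator supplies one word form per category, and the software picks the form by running the number through the rule. When the rules for a language change, the number and order of forms a message needs can change too, which is why existing translations have to be reviewed.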

This change will have very little impact on our users. Translators, however, will have to review the user interface messages that have already been changed to include the updated plural forms. An announcement has been made with the details of the change and instructions for updating the translations for the languages mentioned above.

The CLDR MediaWiki extension, which provides a convenient abstraction for getting country names, language names etc., has also been upgraded to use CLDR 26. The Universal Language Selector and CLDRPluralRuleParser libraries have been upgraded to use the latest data as well.

The Wikimedia Foundation is a participating organisation in the CLDR project. Learn more about how you can be part of this effort.

Further reading about CLDR and its use in Wikimedia internationalization projects:

  1. http://laxstrom.name/blag/2014/01/05/mediawiki-i18n-explained-plural/
  2. http://thottingal.in/blog/2014/05/24/parsing-cldr-plural-rules-in-javascript/

Runa Bhattacharjee, Outreach and QA coordinator, Language Engineering, Wikimedia Foundation