Wikimedia blog

News from inside the Wikimedia Foundation

Posts by GerardM

After the slush, the flood

after the slush, the flush

When new code does not find its way into production for quite some time, it tends to pile up. It is like snow: when the thaw comes, it starts with a trickle, the trickles become streams and all the streams rush down the mountain.

At the WMF Localisation team we worked on our documentation, our help system and our tests. We went to conferences in Belgium and India. And we worked on many small iterative improvements. We rolled out webfonts to more wikis. Input methods were improved and deployed on request. We have had our translation memory working on translatewiki.net for ages and now it is configured for use on the WMF wikis that use the Translate extension. Actually, we first experimented with a new algorithm and configured one of the labs systems as a host for the memory of all the fine work we did and do.

Over time a lot of work went into things like plural rules. As the number of languages increases and as we support not only PHP but now also JavaScript, we are optimising our code and we are checking it again. We frequently find that a refactoring is in order. It makes the code more elegant and easier to maintain. With added documentation and tests we ensure that we know it will work well.

Another fine project waiting to get to the stage where it will flow into our codebase is an updated Easy Timeline. The functionality has always been broken when used in many of the “other” languages: languages written in a different direction or a different script. The updated Easy Timeline has been given a revamp; it uses SVG to create the image and you can test it at the translatewiki sandbox. Amir welcomes bug reports and LOVES to hear your comments.

As you know, we use mingle for our project management (user guest, password guest). In it we have stories that explain the functionality that we are going to develop. Story 532 is one such:

As a potential translator, I want to be able to tell translation administrators in a structured way that I am interested in translating to one or more languages and at the same time provide them with some data about me and preferences on how and how often I would like to be contacted, so that translation administrators can more effectively and efficiently target translators.

Together with the acceptance criteria, a narrative like this enables the developer to develop and the finished product to be accepted by our product manager. A story comes with tasks, and once you have read the stories and the tasks you have an idea of what goes into getting you new functionality.

The conferences were great; we learned a lot from meeting so many wonderful people. Many tests are deployed and they run regularly. The documentation, including user documentation, is written and we would love you to translate it into your language. We feel really pumped up to get cracking and provide you with more functionality in the next sprint.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

The #MediaWiki #hackathon in Pune, #India

When good people get together in a friendly, well organised setting like this weekend in Pune, many great things happen. Several MediaWiki developers had come to provide the many people new to MediaWiki with their expertise and guide people into its inner workings.

Many people worked on Wikimedia mobile and the SmartPhone software, others worked on MediaWiki and its extensions. Bugs got fixed and functionality got extended.

One of the surprises was two people working on the localisation for the Mongolian language. The inclusion of a web font that will support the Dzongkha language is another.

Dzongkha is the official language of Bhutan and according to Ethnologue, the script used is either Tibetan script, Uchen style or the Tibetan script, Umed style. These scripts and styles are also used for the Tibetan language, so it is not only Dzongkha that stands to benefit.

One of the highlights of the work on the SmartPhone app is support for scripts that are written from right to left; this is now “beta” functionality. The result of more people looking at the code was that several bugs received the attention needed to make them go away. Scrolling was one area that got attention; this results in a smoother user experience.

New input methods have been created for Punjabi transliteration and for Gujarati, to be included in Narayam. The continued collaboration with RedHat engineers ensures that our work benefits both MediaWiki and RedHat/Fedora. We do realise that there is still a lot to do and it is not only documentation. Additional work was done on the “visual on-screen keyboard” that was started at the previous hackathon in Pune; it still needs more testing and design work.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

Getting ready for when the freeze is done

When you look at the “sprint backlog” in mingle (guest, guest), you may notice that even though we have been slowed down because of the slush, the feature freeze because of the imminent MediaWiki release, we are not sitting on our hands. Documentation, testing, code review and outreach are on our agenda.

Because of the way we are planning, it is apparent how much code review actually gets done. This sprint we added a review of the ArticleFeedback extension for its internationalization and localization aspects. This is a logical development considering that, with 280+ languages, we are not developing for one language. Our objective for this job is: “As a user I can use the functionality of the ArticleFeedbackv5 so that nothing looks odd in my language from an internationalization and localization perspective”. Reviews like this have been performed informally in the past by translatewiki.net staff. This review, however, will be done during Wikimedia hours and reported through Wikimedia channels.

One old open bug is about EasyTimeline. It started its life in 2005 and it is finally getting the attention it deserves. The bug explains the lack of support for languages like Arabic, Hebrew and Farsi that are written from right to left. The software has Ploticus as a dependency and for a long time we waited for a version of this software that supports RtL languages. We are not waiting any longer and you can read in our story 230 about the complexities involved.

You could say that implementing a translation memory for page translation is a bit more adventurous; it is however debatable if that functionality is new; a translation memory has for a long time been functional at translatewiki.net. It is also very much a feature that makes people more productive. Our team has always had the goal of making life easy and productive for our editors and translators.
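For readers who have not met a translation memory before, the idea can be sketched in a few lines of Python. This is only an illustration and not the algorithm used at translatewiki.net or on the WMF wikis: the function name `suggest`, the 0.75 threshold, the use of `difflib` for similarity and the sample Dutch entries are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def suggest(memory, source, threshold=0.75):
    """Return (score, stored_source, translation) tuples, best match first.

    `memory` maps earlier source texts to their translations. The similarity
    measure and the threshold are assumptions of this sketch.
    """
    matches = []
    for stored_source, translation in memory.items():
        # Compare case-insensitively; ratio() is 1.0 for identical strings.
        score = SequenceMatcher(None, source.lower(), stored_source.lower()).ratio()
        if score >= threshold:
            matches.append((score, stored_source, translation))
    return sorted(matches, reverse=True)


# Illustrative English -> Dutch entries, made up for this example.
memory = {
    "Save page": "Pagina opslaan",
    "Show preview": "Voorbeeld tonen",
    "Show changes": "Wijzigingen bekijken",
}

best = suggest(memory, "Save the page")
print(best[0][2] if best else "no suggestion")
```

A new text that resembles something translated earlier gets the earlier translation offered as a suggestion; the translator still decides whether to use it.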

The “grammar” functionality for JavaScript is part and parcel of the i18n tooling for our developers. It was not ready before the “slush” and it does make our lives difficult not having it available in the code. When you are building tests for “gender” and “plural”, it is so obvious to create them for “grammar” as well. In this sprint, “grammar” will be included in the code for all these good reasons.

This is the first time that there is a story for outreach. We are reaching out to all the Wikipedia language communities to have their own language support team. It will make a difference when all our language communities have been asked to provide their expertise to us. We already have found that many people show an interest and issues do get raised as a result.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

 

Tutorial for using the Translate extension

On Saturday 28 January 2012 at 20:00 UTC there will be a workshop on Translation tools. It will take between 60 and 90 minutes and will consist of an introduction to use cases and features, as well as a Q&A.

The workshop will focus on the use cases covered by the Translate extension on Wikimedia Meta-Wiki for the following user roles:

  • writers: those who write texts that need to be translated
  • translation administrators: those who mark pages for translations and post-process translations when they have been made

Please put the following page on your watchlist and write your name down if you would like to attend. The workshop is held online using WebEx. I would advise you to log in 15 minutes in advance to ensure you have ample time to set up your computer if you have not used WebEx before. WebEx can be used in desktop environments on Linux, OSX and Windows.

If you would like to familiarise yourself with the technology before the workshop, please take a look at the elaborate documentation, which includes some tutorials. In the next two weeks, the already present documentation for translators will also be completed.

Credit goes to Pete Forsyth for proposing to have this workshop. Hope to see you online Saturday!

Siebrand Mazeland
Product Manager Localisation
Wikimedia Foundation

The end of a slushed sprint

Consolidation was the name of the game for the past sprint for Wikimedia’s Localization team. A bug triage, testing, documentation and bug fixes were the activities designed to make our software more stable and more usable. When you read the bug triage report it becomes clear how much the devil is in the details; real native language expertise is needed to understand and assess the issues we aim to solve. Read the report and you will see how much we rely on our community, on people like Srikanth and Nemo_bis.

Now that we are writing documentation in a central place, like here on the language statistics of the Translate extension, we are able to provide you with a help text that is specific to the context. For the language statistics it is a help text about “statistics and reporting”. This functionality is ready but will only become available with the deployment of January 30. You can help us and yourself by reading and understanding the text. Ask when you have questions, and translate the text to make it that much more your own.

Narayam is another extension that has been improved with user documentation. This documentation is completely new and it can effectively replace existing documentation. The existing documentation has the benefit of being written in the local language and we expect that what is written will be similar to the Narayam documentation. The language communities can then decide if they want to point to the local documentation. Like all our software, the Narayam documentation will be available for translation. Having the translation ready may be one of the considerations.

A lot of work is going into the description of the many input methods like the InScript layout for Assamese. These descriptions are “must have” help information when you do not know a particular keyboard layout by heart. They also provide a wonderful opportunity to verify if our implementation for a particular keyboard method is correct. This is yet another instance where native speakers can help us a lot.

Testing and coming to grips with the different tools was a major goal for this sprint. PHPUnit and QUnit are used to test PHP and JavaScript respectively, and the tests we develop run in Jenkins (for PHP) and TestSwarm (for JavaScript). As our team is so much into language support, we are learning what the limits are for testing for different languages and scripts.

All in all there may have been a slush and we have done a lot of code review, but we also managed to make sure that our functionality has gained stability for this and future releases. Additionally, work was done on grammar support for JavaScript, but the patch for that was stuffed in a bug report because of the slush, as the story was moved to the next sprint. Grammar support is what fills the gap in localization support between JavaScript and PHP and makes it available to any and all other developers.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

 

Sprinting ahead when there is a “slush”

When there is a code freeze or a slush, the potential for what is to be delivered is curtailed. It is official; you will not deliver new code, you will work towards consolidation of the new MediaWiki release.

One of the objectives for this and the next release is that the time between releases will decrease. Even though the Localization team works in two-week sprints, it can help with getting the release out of the door. The first thing to do is help even more with code review; the other is to make sure that its own code is optimised for easy coding, testing and use.

When you check out mingle (user guest, password guest), you will find that the developers of our team are learning about the various testing tools. They are even updating the developer documentation to make it easier to understand how to set up new automated tests.

When you are testing, it is necessary that code provides information about its execution. This realization means that the code needs to be refactored in order to allow for testing. Documentation is another part of the puzzle that helps stabilise code; you will find a prodigious amount of documentation that is scheduled for this sprint.

All this translates into quite a minimal deployment for the first week. Its highlights are:

Translate:

  • Better error checking and handling in Special:Translate
  • Translatable page id prefix changed from page| to page-
  • Don’t reuse messages from core

WebFonts:

  • Fixed download of Vemana Telugu font
  • Added font for Ahirani (ahr)

Narayam: Some fixes to Assamese transliteration rules

Core: the cropping of text in level 1 headers is fixed for Indic languages

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

Addressing the many

When you have a message, you use the appropriate language and tools to address multiple people. We do not use our eyes to see how many people we address and we do not use a bullhorn to be heard. Our MediaWiki software knows the numbers involved and a plural-enabled message will be formed according to the rules of the language.
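As a rough illustration of what a plural-enabled message involves, here is a small Python sketch that expands MediaWiki's `{{PLURAL:$1|…}}` syntax for a given number. It is not MediaWiki's parser: the function names, the two-form English rule and the choice to fall back to the last form when a message supplies too few forms are assumptions of this example.

```python
import re

# Matches one {{PLURAL:$1|form|form...}} block without nested braces.
PLURAL = re.compile(r"\{\{PLURAL:\$1\|([^{}]*)\}\}")


def english_rule(n):
    """Index of the form to use: 0 for exactly one, 1 for everything else."""
    return 0 if n == 1 else 1


def render(message, n, rule):
    """Pick the plural form for n, then substitute n for the $1 placeholder."""
    def pick(match):
        forms = match.group(1).split("|")
        # Assumption of this sketch: reuse the last form if too few are given.
        index = min(rule(n), len(forms) - 1)
        return forms[index]

    return PLURAL.sub(pick, message).replace("$1", str(n))


msg = "$1 {{PLURAL:$1|person|people}} watching"
print(render(msg, 1, english_rule))
print(render(msg, 5, english_rule))
```

The message text stays with the translator; only the rule function changes from language to language, which is why getting those rules right matters so much.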

When we implemented plural support for JavaScript, we checked our new implementation against our implementation in PHP and against the standard for such things, the CLDR.

The Localisation team does not know the language rules for the 280+ languages that have a Wikipedia. We prefer to implement what the standard tells us but we support more languages than the CLDR. We want to channel our need for support through “Language Support teams” and we want them to help us understand and fix the inconsistencies and add the missing information to the CLDR.

Inconsistencies with the CLDR
  • Belarusian – ‘other’ form missing in MediaWiki
  • Belarusian-tarask – ‘other’ form missing in MediaWiki
  • Bosnian – ‘other’ form missing in MediaWiki
  • Manx – CLDR has 3, MediaWiki has 4 forms
  • Hebrew – CLDR has 2, MediaWiki has 3 forms
  • Croatian – ‘other’ form missing in MediaWiki
  • Ripuarian / Colognian – order of forms different. CLDR says zero, one, other. MediaWiki says one, other, zero
  • Latvian – CLDR defines zero, one and other forms. MediaWiki has only two forms, one for (1, 21, 31, 41, 51, 61…) and another for the rest.
  • Macedonian – CLDR defines forms[0] for n!=11. MediaWiki defines forms[0] for n%100!=11
  • Polish – ‘other’ form missing in MediaWiki
  • Russian – CLDR defines 4 plural forms. The form for decimals is missing in MediaWiki.
  • Slovenian – MediaWiki defines a zero form which is not present in CLDR

Missing in CLDR
  • Church Slavonic
  • Lower Sorbian
  • Scottish Gaelic
  • Upper Sorbian
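To show why a difference as small as the Macedonian one matters, the following Python sketch compares the two conditions from the list. The list only gives the exclusions (n!=11 versus n%100!=11); that both variants also require the number to end in 1 (`n % 10 == 1`) is an assumption of this sketch, and the function names are made up for the example.

```python
def cldr_form(n):
    """Variant the list attributes to the CLDR: excludes only n == 11.

    The `n % 10 == 1` part is an assumption of this sketch.
    """
    return 0 if n % 10 == 1 and n != 11 else 1


def mediawiki_form(n):
    """Variant the list attributes to MediaWiki: excludes every n ending in 11."""
    return 0 if n % 10 == 1 and n % 100 != 11 else 1


def disagreements(limit):
    """Numbers from 0 to limit (inclusive) where the two variants differ."""
    return [n for n in range(limit + 1) if cldr_form(n) != mediawiki_form(n)]


print(disagreements(320))
```

Under these assumptions the two variants agree for every number a casual test would try and only part ways at 111, 211, 311 and so on, which is exactly the kind of detail a native speaker can settle quickly.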

Please make a difference for the support for your language and join the Language support team.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

End of sprint 6; Translate and other goodies

Every two weeks a sprint and every week a deployment. The Localisation team aims to bring you new and updated functionality when we have it.

As you can see in the summary below, the focus this sprint has been very much on the Translate extension. Management of translations and the translation process is what we have worked on. When texts are translated in a wiki, they often are only needed within a specific time frame; it is now possible to mark a text as no longer needing any effort. For many languages there are multiple people involved in the workflow for the creation of a document that is well written in translation. For them to work well together, it helps when their work changes its state so that it is clear that, for instance, something has been proofread.

The person who manages the publication and distribution of a page needs workflow states to decide what more needs to be done and what is ready. To do this they can make use of states that already exist or define additional states. These states are available as local messages and are available for translation.

Translate extension features

  • Message workflow states help translators translate, review and prepare messages for publication
  • There is now a new message group for recent translations. This message group makes these states possible in translation
  • Special:MyLanguage can now be used with language sub pages to be used as the default fall-back instead of providing an untranslated version
  • Pages marked for translation can now be marked as “discouraged”. They will no longer show up in the usual places. This prevents translators from translating them needlessly.
  • Added {{#translationdialog:title}} for creating a link to the translation dialogue

Translate bug fixes

  • The flash of unstyled content effect is reduced
  • Made the extension work without legacy JavaScript globals
  • The summary row in Special:LanguageStats and Special:MessageGroupStats is no longer sorted with rest of the rows.
  • Fixes to the sizing of the translation editor dialogue
  • Fixed a fatal error that sometimes occurred when the translation page title used GRAMMAR and the page was viewed with the English UI.

Miscellaneous changes

  • Parserfunctions ifexist magic word Italian translation fixed to ‘ifexist’
  • Narayam preference wording changes from disable to enable
  • The WebFonts icon no longer overlaps with the menu text
  • WebFonts preview allows you to preview a text with a font. You can download these freely licensed fonts to your system.
  • GENDER and PLURAL support are now available for use in JavaScript.
  • Consistency updates for grouppage-* messages, for LocalisationUpdate
  • Fixing be-tarask grammar forms

Changes deployed last week

  • WebFonts was deployed for the Bishnupriya Manipuri language; it uses the Lohit Bengali font
  • Support for gendered namespaces was deployed for the Russian wikis.

As always, you are welcome to have a look at our sprint backlog (user:guest password: guest) and bug us in bugzilla with whatever needs fixing.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

The localisation team sprints into the new year...

WebFonts is the first extension that gets user documentation served from MediaWiki.org. At the time of writing, the documentation has been written, it serves people with help text about WebFonts and it is ready for translation. People looking for help will be served help in the language of their user interface if there is a translation.

WebFonts drop down on or.wikipedia.org

In a way it seems like a minor thing, but consider:

  • MediaWiki can serve help texts for its functionality
  • this help text may differ based on the language of the user
  • the help text can be translated
  • a new community for MediaWiki help text translation is needed
  • functionality like Narayam will surely get its user documentation in the near future

It will be a challenge to other developers and developer teams to adopt and refine the way assistance to our users is provided. We learned at translatewiki.net that documentation did improve the quality of the localisations. We hope that user documentation will reduce confusion and make for happy editors and readers.

The WebFonts user documentation was deployed last Tuesday. This and some other changes can be found in the deploy list. As the holiday season is in full swing, sprint 6 has started; it will run into the new year.

In this sprint stories will be developed that will make “Translation review” feature complete. When this is implemented, it will help translators and localisers review each other’s work and assign a status to their work for further consideration. As you can imagine, the different statuses themselves will become available for translation; card 326 defines this and will make this possible. This is just one of many stories that make up this feature.

For the localisers of the MediaWiki software a long-held ambition will be realised; card 206 will see “plural” support implemented for JavaScript. Once this functionality is deployed, a long list of changes to the actual messages will follow.

The new year will bring many new challenges and opportunities to the many, many language communities. The Wikimedia Localisation team will work hard to provide you with the tools to be efficient in any language to get our message out and provide information in any language. For some of us the new year starts at a different moment so it will be very much business as usual; we welcome you to have a look at our sprint backlog (user:guest password: guest) and bug us in bugzilla with whatever needs fixing.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

 

Localisation team sprint 5 update II

Probably the most interesting highlight of today’s i18n deployment is the configuration of the Translate extension on MediaWiki.org. We have observed that on some wikis special pages exist that explain, in the language of the wiki, functionality like Narayam or WebFonts. Such documentation is welcome on all MediaWiki installations where the functionality is used by people using the same language for their user interface.

For writing the documentation MediaWiki.org is the obvious platform. With the deployment of Translate we have the basis for writing and translating user documentation in a structured and organised way.

Narayam and WebFonts have been updated to the latest versions that have been tested on translatewiki.net. As Narayam and WebFonts are still very much a work in progress, we invite anyone to continue their testing at translatewiki.net. The changes are:

  • menu appears only on click, not when hovering
  • menu positions are now correct for RTL languages and do not go off screen any more
  • Narayam and WebFonts support the Kannada script for the Tulu language on the Incubator

There are also some smaller fixes, among them the change of the autonym for the Veps language to “Vepsän kel”. The full details of all the changes are at revision 106667.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant