Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts by Srikanth Lakshmanan

OpenSource Language Summit

The Wikimedia Foundation and Red Hat co-organized an Open Source Language Summit in Pune, India on November 6-7, 2012. The summit focused on language tools and technology development to support languages on Wikipedia, the Web, Linux and other Open Source platforms.

Santhosh Thottingal presenting his talk on jquery.ime

In total, 45 core language technology developers, open source contributors, typographers and technology evangelists from the Wikimedia Language Engineering and Mobile teams, Red Hat, Mozilla Foundation, KDE, GNOME, translatewiki.net and other open source projects participated in sessions and work sprints on internationalization and localization features supporting various open source projects on the web and Linux. After brief introductory talks, we focused our work on font support, input method tools, language search, and web and localisation standards.

Highlights: 

The event had short talks on the following topics:

Selected achievements

The following people won prizes for their code contributions during the event:

  • Anish Patil ported Universal Language Selector’s cross-language search algorithm to GNOME language search.
  • Aravinda VK wrote a set of FontForge Python wrappers to make changes to fonts programmatically. Aravinda also fixed a few bugs in the Kannada Gubbi font for the HarfBuzz rendering engine and wrote a Kannada KGP keymap for jquery.ime.
  • G Karunakar added a Hindi InScript keyboard layout to Firefox OS Gaia.

Other accomplishments included:

  • Kushal Das wrote patches to deploy Universal Language Selector on http://www.mozilla.org, as well as a patch for a bug on the Mozilla localization platform.
  • Alolita, Sankarshan, Runa and Satish discussed APIs for various translation workflows and put together an initial specification.
  • Rajeesh Nambiar, Hussain KH, Ani Peter, Praveen A and Pravin Satpute fixed and filed upstream bugs for Malayalam, Kannada, Gujarati and Punjabi fonts with HarfBuzz.
  • Parag Nemade added InScript2 keyboards for Sanskrit, Nepali, Marathi and Konkani to jquery.ime.
  • Ankit Gadgil wrote over 200 unit tests for Marathi and Hindi input methods in jquery.ime.
  • Yuvaraj Pandian, Pau Giner, Arun Ganesh and Siebrand Mazeland developed an initial version of a native Android app for reviewing translations on translatewiki.net.
  • Pau Giner tested new translation prototypes with translators. Arun Ganesh created an icon for gnome-transliteration.

You can browse through tweets and more notes from the event. Happy reading!

Srikanth Lakshmanan
Internationalisation/Localisation Outreach / QA Engineer

Universal Language Selector now has Input Methods

The Language Engineering team at the Wikimedia Foundation works on a set of tasks every two weeks. This post is about the team’s accomplishments over the past two weeks.

Have you ever sat at a computer in a foreign country, and wondered how you were going to enter text in your language using a keyboard with a different alphabet?

“Input methods” are interfaces that allow users to enter text in a script different from the one used on their keyboard. On some Wikipedia versions (like wikis in Indic languages), such a tool has been available through the Narayam extension.

As part of Project Milkshake, this feature has recently been exported to a JavaScript library (a bundle of code, called jquery.ime) so that it could be reused by other web developers.

Another language-related tool, the Universal Language Selector (ULS), allows readers of Wikipedia and its sister sites to easily pick the language of their choice for the website’s interface.

Over the last two weeks, we’ve integrated the input methods’ functionality directly into the Universal Language Selector: it now comes with a large set of input tools that users can use to input text in languages written in non-Latin scripts.

The integration of the two tools makes the interface more consistent and usable when it comes to choosing languages in which to read (“display”) and to write (“input”) on the site: both settings are located in the same dialog of the Universal Language Selector.

When selecting a language in which to write, it’s possible to set an accompanying preferred input method for that language, if available. When input methods have been assigned to different writing languages, switching between languages in the menu will automatically change to the preferred input method for that language.

Other language engineering news in brief:

  • The Language Engineering team will be in India during the second week of November to participate in the OpenSource Language Summit in Pune, and the Wikimedia DevCamp in Bangalore. For new volunteers who want to get started contributing to our tools, we’ve prepared a list of bugs that you can work on at these events with our support.
  • We’ve also worked on finalizing the development plan and features for Translate UX improvements, which were identified by user testing with volunteer translators to improve translation efficiency.
  • We’ve worked on how to get metrics on the impact of our tools through URL-based usage data gathering. Feedback is welcome.
  • We’ve fixed some bugs related to the ULS and gender support in MediaWiki and MediaWiki extensions.
  • The Narayam and Webfonts extensions were deployed to Wikimedia sites in Marathi; Narayam was also deployed to sites in Amharic.
  • An early stable version of ULS was deployed on Wikidata; this first use on a production site revealed a few bugs that were fixed. It will be updated to the latest stable version periodically.

Srikanth Lakshmanan
Internationalisation/Localisation Outreach / QA Engineer

Designing for the multilingual web

User testing is essential for designing multilingual interfaces, even though it can be a time-consuming process: it ensures that the community of users is part of the design process. In this article, we share lessons learned by the Language Engineering team while designing features and interfaces that empower users to read and edit Wikipedia and its sister sites in many languages.

Designing user interfaces for the Wikimedia world comes with a lot of responsibility. To achieve our mission, we need to make sure we think of users with varying levels of technological expertise and language skills. While the internet can be a very friendly place for those who understand English, it could be like navigating the Greek Wikipedia to the 4.4 billion people who do not.

While designing the user interface (UI) for language tools like the Universal Language Selector (ULS) and Translate extension, we needed to make sure it could be understood and used by those who use the internet in languages other than English. We had to create early representations of the interface, link them together to create interactive prototypes and test them with users. Each of these steps presents various challenges in a multilingual environment.

Design tools generally have poor support for non-Latin scripts. Moreover, creating screens and prototypes in languages that you don’t speak is hard. But since the world needs these language tools, we can’t wait for our design software to improve; we just need to figure out our own ways to get things done.

Creating multilingual mock-ups

UI designers make layout comps early in the process to illustrate how the interface elements and content will be arranged.

While designing the ULS (which will display a massive list of languages), the only way to understand the effectiveness of the layout was to simulate the end result with all the language names. Common graphic design suites are not optimized to manage large numbers of text elements and have issues when working with non-Latin fonts.

Our workaround: After some exploration, we realized that the most painless way of creating comps that have multilingual text is to render them outside the design software:

  • Create the UI layout using your design tool;
  • Use a template language like Mustache to include placeholder text within the mock-ups, and export them as SVG files;
  • Create a translation text file that maps the placeholder text to strings in your language;
  • Use a script to perform a string replacement in the SVG and rasterize it with Inkscape.
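The last two steps can be sketched in a few lines of Python. This is only an illustration, not the script we used: the `key = value` translation file format, the function names and the placeholder names are assumptions made for the example. The Inkscape options shown are those of Inkscape 1.x; older releases used `--export-png=FILE` instead.

```python
import re
import subprocess
from pathlib import Path
from xml.sax.saxutils import escape

# Matches Mustache-style placeholders such as {{ search-label }}.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.-]+)\s*\}\}")


def load_strings(text: str) -> dict[str, str]:
    """Parse a hypothetical 'key = value' translation file into a dict."""
    strings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        strings[key.strip()] = value.strip()
    return strings


def fill_placeholders(svg_text: str, strings: dict[str, str]) -> str:
    """Replace each {{key}} with its translation; unknown keys stay untouched."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in strings:
            return match.group(0)
        # Escape &, < and > so the result is still valid SVG/XML.
        return escape(strings[key])

    return PLACEHOLDER.sub(substitute, svg_text)


def rasterize(svg_path: Path, png_path: Path) -> None:
    """Render the SVG to PNG with the Inkscape command line (1.x syntax)."""
    subprocess.run(
        ["inkscape", str(svg_path), "--export-type=png",
         f"--export-filename={png_path}"],
        check=True,
    )
```

A translator only has to edit the text file; running `fill_placeholders` on the exported SVG and then `rasterize` on the result produces one comp per language.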

There is a neat illustration of the entire process in this video by Pau Giner. This process allowed us to quickly experiment and test comps in many languages by giving the text file to a translator.

Making interactive prototypes

Translate extension prototypes in Malayalam

The best way to understand if a design is effective is to observe a user using it. The fastest way to do this is to make click-through prototypes that simulate a workflow. When our multilingual comps were ready, the next task was to package them and link them by hotspots. Most of the popular tools to do this are not free. After scouting around for free alternatives, we chose a Firefox extension called Pencil because it:

  • imports raster and vector images, including copy-pasting from other design tools;
  • features master pages and a component library to reuse graphic elements;
  • has rich text support;
  • exports to a single HTML which simulates a web page experience;
  • is free and open-source.

It fulfills our requirements, even though there are a few annoying quirks in the interface which could be improved. Check out this interactive presentation of the ULS that was created using Pencil.

Remote user testing

Remote user test

Once our prototypes are ready, it’s finally time for the real test, with users from parts of the world where the primary language is not English (roughly 95 percent of the world’s population). Planning the logistics and schedule of remote user testing can be tricky, so here are a few key points to keep in mind:

  • Create a pool of volunteer user testers early in the design process. Finding a tester is usually hit-and-miss, so it helps to have a volunteer base that can be easily reached when needed.
  • Tell users what the test is about. Most users will not know what happens in a user testing session and may be afraid to volunteer. Who likes to be tested anyway? We created this guide to better communicate what the sessions were about.
  • Schedule the tests in advance and ask for confirmation. We found that testers may not be available at the scheduled time and often want to reschedule. Stay friendly and accommodating, as these people are providing you with valuable feedback.
  • Observe using a platform that meets your requirements. We found Google+ Hangouts to be the service of choice due to its ease of use across operating systems. As a bonus, it can automatically create a YouTube video of the whole session.
  • Inform the tester beforehand on privacy issues. If you want to share the observations publicly, make sure the tester knows and has agreed to those terms.
  • Have fun. Keeping the mood light with initial introductions will help to make the tester feel comfortable and give more feedback.

If you are curious, you can watch this video from our latest test sessions of the ULS to understand how we do it.

We are designing and developing your software, so keep the feedback coming!

Arun Ganesh and Pau Giner
UI/UX Designers, Language Engineering Team

Language Engineering: Input methods and Visual Editor

The Language Engineering team at the Wikimedia Foundation works on a set of tasks every two weeks. This post is about the team’s accomplishments over the past two weeks. You can also check the slides of our demonstration.

jQuery.ime: Wikimedia wikis use Extension:Narayam to support input of non-Latin text. As part of Project Milkshake, we ported Narayam to jQuery.ime, a generic input method tool that can be used even outside the Wikimedia universe. We have completed the development of jQuery.ime and this example demonstrates the plugin in action. It supports over 60 input methods across 32 languages. There is a detailed technical specification, and we welcome you to try it out and contribute to the project by creating new input methods or reporting bugs. The next phase will be to integrate jQuery.ime with the Universal Language Selector.

Internationalization requirements for VisualEditor: The VisualEditor will change the MediaWiki editing interface in a major way, making it much more user-friendly. The Language Engineering team has a keen interest in making sure the VisualEditor supports all languages. We have written detailed internationalization and bidirectional text requirements for the VisualEditor to support all languages, including right-to-left languages. Other available documents are a general test document, a right-to-left test and Indic tests for testing input method compatibility with the VisualEditor. Please perform these tests for your language and report any bugs you find.

India Events: The Language Engineering team will be in India in early November participating in the OpenSource Language Summit in Pune and the Wikimedia DevCamp in Bangalore. If you are a developer interested in working on language-related tools or Wikimedia Mobile, please sign up for the DevCamp. We will also meet up with the community and talk about Language Engineering tools at the Language Engineering meetups in Pune and Bangalore. If you’re nearby, please sign up and we’ll see you there!

In brief:

  • Universal Language Selector got some bug fixes, including fixes for scrolling and font selection, and it is now fully internationalized.
  • As mentioned in the previous blog post, we have completed integrating Extension:Translate with CentralNotice. Some patch sets are awaiting code review. Unfortunately, this feature might be missing from this year’s fundraising translations, due to other fundraising priorities.
  • We held IRC office hours (log) on October 17th. The next session is on November 21st.

Srikanth Lakshmanan
Internationalisation/Localisation Outreach / QA Engineer

Translating Central Notices easily, and other Language engineering news

The Language Engineering team at the Wikimedia Foundation works on a new set of tasks every two weeks. The following describes the work we have done over the past month.

Language Engineering Team: We have renamed our team from “Internationalization & localization” to “Language Engineering”. The terminology we previously used did not illustrate our goals very clearly. We hope that the new name will communicate our goals and activities better.

Translating Central Notices: The process of translating banner texts for CentralNotice (used for Wikimedia banners) will soon be more streamlined, after the integration with the Translate extension. It is now possible to create message groups for translation from CentralNotice. We are completing a few pending tasks before making this tool available on Wikimedia projects.

Partially translated Tamil Interface of the Universal Language Selector.

Universal Language Selector: The jQuery component of Universal Language Selector (ULS) is now internationalized and can be translated. This also fixes the ‘placeholder’ that some of our readers mentioned in comments on our previous posts. ULS also got many bug fixes, including proper input tool support and improved workflow between the settings screens. We are discussing with the Operations team how to deploy ULS on small Wikimedia wikis without negatively impacting current caching mechanisms for anonymous users.

Project Milkshake: We fixed bugs, fixed tests and addressed review comments related to jquery.i18n, the internationalization library. jquery.uls, a library for language selection, is the latest addition to Milkshake.

San Francisco Meetings: Most of the language engineering team members were in San Francisco in September. We met with other Engineering teams at the Wikimedia Foundation, including the Visual Editor team, and discussed Internationalization requirements with the Mobile team. We talked about Making the Web Multilingual with Wikipedia (video) at San Francisco State University, and presented Project Milkshake at Google, Twitter and change.org.

Other news

  • We held IRC office hours (log) on September 19th, and will be holding the next session on October 17th.
  • We will be in Bangalore in November for the Wikimedia DevCamp.

Srikanth Lakshmanan
Internationalisation/Localisation Outreach / QA Engineer

Internationalisation team introduces translation memory and plan for language teams

The Internationalisation/Localisation (i18n/l10n) team at the Wikimedia Foundation works on a set of tasks every two weeks. This post is about the team’s accomplishments over the past two weeks.

You can also watch the 15 minute demonstration and check the slides.

Language Teams

MediaWiki supports over 350 languages. Supporting languages goes far beyond providing a localised interface. It includes ensuring that the overall experience of reading and contributing to Wikipedia (or any wiki running MediaWiki) in a given language is the same as it is for an English user. This is impossible without the help of volunteers in the respective languages. We have tried filling this gap through Language Support Teams. We are now starting an initiative to form and build strong language teams and have published a detailed plan. Please do let us know what you think about it.

Translation Memory on Wikimedia servers

The Translate extension is currently deployed on Meta-Wiki, mediawiki.org and a few other Wikimedia wikis to facilitate the translation of their content. In order to make translations faster and more consistent, a shared translation memory service has been enabled in the extension, which will provide translation suggestions from past translations of similar text.
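To give a feel for what a translation memory does, here is a minimal sketch in Python. It is not the service deployed with the Translate extension: the similarity measure (the standard library's `difflib`), the threshold, the function name and the sample German strings are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def suggest(source, memory, threshold=0.7, limit=3):
    """Return past translations whose source text resembles `source`.

    `memory` is a list of (source_text, translation) pairs. Each result is
    a (score, source_text, translation) tuple, best match first.
    """
    scored = []
    for past_source, translation in memory:
        # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
        score = SequenceMatcher(None, source.lower(), past_source.lower()).ratio()
        if score >= threshold:
            scored.append((score, past_source, translation))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:limit]


# Illustrative sample data, not taken from any real wiki.
memory = [
    ("Save page", "Seite speichern"),
    ("Show preview", "Vorschau zeigen"),
    ("Save the page", "Die Seite speichern"),
]
suggestions = suggest("Save page", memory)
```

An exact match scores 1.0 and comes first; close variants follow, and unrelated strings fall below the threshold and are not suggested.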

Universal Language Selector Update

ULS with Common languages based on geolocation and “accept languages” listed

The Universal Language Selector (ULS) has a new top section in the default view. It now lists a set of “Common Languages”. They include languages spoken in your region (determined by geolocation), preferred languages from your browser (“Accept-Language”) and previously used languages. For many users, this should minimise the need to search for their language of choice in the list. The ULS now integrates with MediaWiki preferences, and performance was improved by using lazy loading.
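The way the three sources could be combined into one short list can be sketched as follows. This is a simplified Python illustration and not the ULS code: the function names, the priority order (previously used languages first, then browser languages, then regional ones) and the limit of five are assumptions.

```python
def parse_accept_language(header: str) -> list[str]:
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        tag = tag.strip().lower()
        if tag and tag != "*" and quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    weighted.sort()
    return [tag for _, _, tag in weighted]


def common_languages(previous, accept_header, region_languages, limit=5):
    """Merge the three sources into one de-duplicated, ordered list."""
    ordered = []
    candidates = [*previous, *parse_accept_language(accept_header), *region_languages]
    for code in candidates:
        code = code.lower()
        if code not in ordered:
            ordered.append(code)
    return ordered[:limit]
```

For example, a user who previously picked Malayalam, whose browser sends `ta;q=0.8, en-GB, hi;q=0.9` and who is located in a region where Hindi and Marathi are spoken would see `ml`, `en-gb`, `hi`, `ta` and `mr` in that order.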

Milkshake Update

jQuery.i18n, the internationalisation messaging framework library, saw the addition of support for gender, plural and grammar. All the language-specific rules that exist in MediaWiki are now available in jQuery.i18n. The framework is also built with scope for adding more extensible attributes in addition to the default ones. It supports lazy loading of translations.

Other accomplishments

  • Translation UX tests continued, and led to improvements to the design prototypes for the Translate extension, including new translator captchas. You can watch a short video about the project.
  • We investigated solutions to support Indic languages on mobile platforms.
  • We organized a bug triage on i18n issues.
  • Apart from the usual bug fixes, a lot of Right-to-Left bugs were also fixed in the experimental PageTriage and ArticleFeedback v5 extensions, as well as in the veteran WikiHiero extension and core MediaWiki.

Srikanth Lakshmanan
Internationalisation/Localisation Outreach / QA Engineer

WebFonts in Universal Language Selector, Translation rally

The Internationalisation/Localisation (i18n/l10n) team at the Wikimedia Foundation works on a set of tasks every two weeks. This post is about the team’s accomplishments over the past two weeks.

The Universal Language Selector (ULS) is designed not only to change the user interface language, but also to help the user set other language settings. ULS will contain input method and font settings as well. The current version of ULS includes full integration with WebFonts and is now set up for testing at http://translatewiki.net. Let us know about any bugs you find by reporting them at Bugzilla. ULS now uses the Project Milkshake component jquery.webfonts. This means that when ULS is deployed, the WebFonts extension will be deprecated.

Language Settings in the ULS

Display settings dialog with option to choose font and to change the user interface language.

Other accomplishments

  • Published a detailed plan for collecting metrics and impact measurement criteria for our work. Please provide us feedback after reviewing these criteria.
  • Conducted a translation rally at translatewiki.net. According to preliminary results, 180 translators contributed more than 65,000 translations to different Wikimedia-related projects in 115 languages. 54 users will share the bounty sponsored by Wikimedia Nederland, the Dutch Wikimedia chapter.
  • Improved prototypes for Translate workflow user experience and completed user tests. You can check out test results. If you are a translator and wish to help us test the prototypes, please sign up.
  • Met with the Wikimedia Deutschland Wikidata team about moving ahead with ULS for Wikidata language selection.
  • The Georgian alphabet is now supported in Extension:Narayam.
  • All Marathi wiki projects now have Narayam input methods deployed.
  • Tim Starling wrote a PHP parser for the CLDR plural rule definitions. Some time ago, we wrote a JavaScript parser for these definitions. Soon, plural support in MediaWiki localisation messages will be based on the CLDR plural definitions.

Please remember to join our sprint demo every two weeks to keep up with our latest work. The demos are every other Tuesday at 15:00 UTC. If you could not attend, you can watch the latest demo; all our earlier demonstrations are also available on Commons.

Srikanth Lakshmanan, Internationalisation/Localisation Outreach / QA Engineer

Internationalisation team updates on Universal Language Selector and Project Milkshake

The Internationalisation/Localisation (i18n/l10n) team at the Wikimedia Foundation works on a set of tasks every two weeks. The following describes the work we’ve done over the past month.

We’ve worked on making the Universal Language Selector (ULS) functional according to the design prototypes, and it is now available for alpha testing on http://translatewiki.net. Please help us test it in your language. The features listed below are the highlights from the development work done on the ULS:

  • Approximate search, which returns results even when the query contains typos.
  • Language search in any language: you can search using the names of languages as translated into other languages.
  • Auto-complete for search across languages.
  • Writing systems that belong to the same family are visually grouped together.

Languages are grouped by writing system.

Fuzzy search: Searching 'tail' returns 'Thai' and 'Tamil'.
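The search features listed above can be illustrated with a small Python sketch. It is not the algorithm used in the ULS: plain edit distance with a fixed threshold, prefix matching for auto-complete, and the tiny name table are all assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the number of single-character edits between a and b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def search_languages(query, names, max_distance=2):
    """Return language codes whose names start with, or closely resemble, the query.

    `names` maps a searchable name (written in any language) to a language code,
    so several names can point at the same language.
    """
    query = query.casefold()
    hits = {}
    for name, code in names.items():
        candidate = name.casefold()
        if candidate.startswith(query):
            distance = 0  # auto-complete: a prefix counts as a perfect hit
        else:
            distance = edit_distance(query, candidate)
        if distance <= max_distance:
            hits[code] = min(distance, hits.get(code, distance))
    return sorted(hits, key=lambda code: (hits[code], code))


# A tiny, hypothetical name table; a real one would cover many more names.
names = {
    "Thai": "th", "Tamil": "ta", "தமிழ்": "ta",
    "Hindi": "hi", "French": "fr", "Französisch": "fr",
}
```

With this table, searching for "tail" finds Tamil (one edit away) and Thai (two edits away), while "franz" finds French through its German name.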

Another focus of our work has been Project Milkshake, an effort to release our existing internationalisation-related JavaScript components as standard jQuery libraries. These libraries, dual-licensed under the GPL and MIT licenses, will not only help us integrate our tools better with the Universal Language Selector, but will also let other people reuse them in other web projects and get the benefits of internationalisation just by using these libraries:

  • jquery.i18n: a library that provides a full i18n framework supporting parameter replacements and grammar-, plural-, and gender-dependent translations;
  • jquery.webfonts: a library that provides webfonts support, developed from the WebFonts extension;
  • jquery.ime: a library that provides input methods in the browser, developed from the Narayam extension;
  • jquery.uls will also be available soon as we make more progress on ULS.
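To show what parameter replacement and plural- or gender-dependent translations mean in practice, here is a toy formatter in Python. It is not jquery.i18n itself: it handles only MediaWiki-style `$1` parameters and `{{PLURAL:…}}` / `{{GENDER:…}}` blocks, the function name is made up, and the plural rule is hard-coded to an English-like "one versus other" rule, whereas real rules differ per language.

```python
import re

PLURAL = re.compile(r"\{\{PLURAL:([^|}]+)\|([^}]*)\}\}")
GENDER = re.compile(r"\{\{GENDER:([^|}]+)\|([^}]*)\}\}")
PARAM = re.compile(r"\$(\d+)")


def format_message(template, *params):
    """Fill $1, $2, ... and resolve PLURAL/GENDER blocks in a message."""
    def param_value(match):
        index = int(match.group(1)) - 1
        # Leave the placeholder untouched if no such parameter was passed.
        return str(params[index]) if 0 <= index < len(params) else match.group(0)

    def plural(match):
        count = int(PARAM.sub(param_value, match.group(1)))
        forms = match.group(2).split("|")
        # Assumption: English-like rule, first form for exactly 1, second otherwise.
        return forms[0] if count == 1 or len(forms) == 1 else forms[1]

    def gender(match):
        value = PARAM.sub(param_value, match.group(1)).strip().lower()
        forms = match.group(2).split("|")
        if value == "male":
            return forms[0]
        if value == "female" and len(forms) > 1:
            return forms[1]
        # Unknown gender: use the third (neutral) form if the message has one.
        return forms[2] if len(forms) > 2 else forms[0]

    text = PLURAL.sub(plural, template)
    text = GENDER.sub(gender, text)
    return PARAM.sub(param_value, text)
```

For instance, `format_message("Found $1 {{PLURAL:$1|result|results}}", 5)` picks the plural form, and `format_message("$1 changed {{GENDER:$2|his|her|their}} settings", "Asha", "female")` picks the form matching the second parameter.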

We continued to improve the translation user experience by making the screens more user-friendly and the process more efficient for translators. We are currently working on the prototypes. This work aims to increase the number of translators, as well as provide an interface that helps them spend their time as efficiently as possible. In the coming weeks, we will perform in-depth usability tests with several members of the community. You can learn more about them at the Translation UX page and, if you are interested, consider volunteering to participate in the usability tests.

As usual, we also continued fixing issues that were reported in Bugzilla, as well as translation-related issues in translatewiki.net. Narayam, the input tool, was deployed on Bengali Wikisource. If you need Narayam or WebFonts enabled on your wiki, please open a bug in Bugzilla.

In the coming weeks, we will be working on integrating Narayam and WebFonts as part of the language tools in the Universal Language Selector, and on completing the translation UX improvements.

We will also hold our monthly online office hours on August 15 at 16:30 UTC (8:30 PDT). This is an opportunity to ask the development team questions about current and upcoming i18n features. The team will also share updates on exciting work happening on the ULS, translation workflow enhancements and additional language support in Narayam and WebFonts.

Please feel free to contact us through the mediawiki-i18n mailing list.

Srikanth Lakshmanan, Internationalisation/Localisation Outreach / QA Engineer