Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts by Amir E. Aharoni

First Wikimedia hackathon in Tel Aviv, Israel

On Thursday, 23 May, just one day before the big Wikimedia hackathon in Amsterdam, Wikimedia Israel held its first hackathon in Tel-Aviv.

Israel has a thriving software industry, as well as a healthy Wikipedia editing community. Despite this, there are relatively few software developers in Israel who work on Wikimedia-related projects, so the primary purpose of this event was to show new people who are skilled in programming and web design how they can contribute their talents to our free knowledge projects.

Wikimedia Israel had already organized a hackathon as part of the Wikimania 2011 conference, which was held in Haifa, but this was the first time that such an event was held in Israel independently of other events.

Google Israel kindly gave us the venue – the hacking space in their Tel-Aviv Campus building, which is perfect for such events: cozy, simple, with comfortable tables, a lot of power strips and good wifi. About thirty people showed up for the event. Their skills were varied and quite surprising. There were not just PHP and JavaScript developers – these languages being the most important in MediaWiki – but also experts in DevOps, integration testing, Python scripting, data visualizations and design.

In the best hackathon style, the event focused less on talks and more on code, but I was very happy to host one guest talk by Mushon Zer-Aviv, a developer of the freely licensed Alef font, designed as a modern Hebrew and Latin typeface for the web.

So, most importantly, what did the event accomplish? Among other things: fixes for two MediaWiki bugs, both made by new developers; improved automatic tests for JavaScript components; a prototype for a script that enriches Wikipedia with data from Open Knesset, a database of information about the Israeli parliament based on open-source technology; and a new template in Lua, also made by a developer who is completely new to the language. I had the feeling that most of the participants became genuinely interested in joining the community of MediaWiki developers.

I want to use this opportunity to give my very sincere thanks to the people who helped me organize the event: Chen Davidi, Itzik Edri and Dorit Shafir-Diamant, who were instrumental in organizing the event’s logistics; Michal from Google Israel for providing the venue; and also to Yair Talmor, Chezi Reshef, Yael Meron, Elad Alfassa, Oren Held, Moshe Nachmias and Yair Podemasky, who very kindly volunteered to help with setting up the venue, handled the registration and cleaned up at the end of the day.

The event was very satisfying, and we hope to have another one soon!

Amir E. Aharoni, Wikimedia Israel

Language Engineering: Progress With Input Methods and Translation Editor

Batti il ferro finché è caldo —an Italian proverb

In its last two-week sprint, the Wikimedia Language Engineering team worked with developers from other teams to improve its keyboard support and continued working on the new user interface for the Translate extension.

Input methods: More languages and support for mobile devices added

jquery.ime, Wikimedia’s portable keyboard layouts library, got boosts from two sources during the last sprint.

Yuvi Panda from the mobile team refactored jquery.ime to make it usable on mobile phones. Now, the more than 60 keyboard layouts that jquery.ime supports will also be usable on Android mobile phones. If you’d like to try an early testing version of the mobile keyboard layouts and help develop them, head to the mobile keyboard layouts GitHub repository.

Engineers from Red Hat also joined the input method development effort and added new and improved layouts for the Gujarati, Punjabi, Tamil, Malayalam, Kannada and Telugu languages, spoken by millions of people in India.

If a keyboard layout for your language is missing, you can send a pull request to the main jquery.ime repository.

Progress on translation editor user experience

The team continued fixing and improving the new translation editor, getting it ready to release. Some of the recent improvements include:

  • The most relevant translation memory suggestions are shown at the top.
  • Messages are loaded automatically when the user scrolls to the bottom of the page.
  • The status bar at the bottom of the page shows information about the status of the translations.
  • Recently translated projects are now displayed correctly in the project selector.
  • Discouraged translation projects are omitted from the group selector.
  • Message documentation can now be edited inside the translation editor.
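
To illustrate the first item in the list above, here is a minimal Python sketch of ranking translation memory suggestions by relevance. It is not the Translate extension's actual algorithm: the similarity measure (Python's difflib), the function name rank_suggestions and the sample memory entries are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def rank_suggestions(source, memory, limit=3):
    """Return up to `limit` (score, translation) pairs, most similar first.

    `memory` is a list of (past_source, past_translation) pairs.
    The score is difflib's similarity ratio between 0.0 and 1.0.
    """
    scored = []
    for past_source, past_translation in memory:
        matcher = SequenceMatcher(None, source.lower(), past_source.lower())
        scored.append((matcher.ratio(), past_translation))
    # Highest similarity first, so the most relevant suggestion is on top.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical translation memory (English to Spanish), for illustration only.
memory = [
    ("Save page", "Guardar página"),
    ("Delete the page", "Borrar la página"),
    ("Save the page", "Guardar la página"),
]

best = rank_suggestions("Save this page", memory)
for score, translation in best:
    print(f"{score:.2f}  {translation}")
```

A real translation memory would use an index instead of comparing against every stored string, but the ordering idea is the same.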

A video showing some of the recently deployed features in action: most relevant translation suggestions shown first; inline translation documentation editor; automatic loading of messages when scrolling; experimental faceted search page for translations.

The features that were already implemented were also tested with real users by the team’s interaction designer, Pau Giner. The issues that the users reported were noted and will be fixed in the coming sprints.

Niklas Laxström implemented faceted search for translations using the free Apache Solr engine and deployed an experimental version of the translation search on the testing site. He also gave an open presentation about Solr and its upcoming use in translatewiki.net. You can watch Niklas’s presentation about Solr on YouTube.

Next week some of the team members will participate in the FOSDEM conference, and after that in the 2nd Language Summit at the Red Hat offices in Pune.

Amir Aharoni, Software Engineer (Internationalization), Language Engineering team

A more efficient translation interface

De mica en mica s’omple la pica i de gota en gota s’omple la bota. —a Catalan proverb

During its most recent development sprint, the Wikimedia Foundation’s Language Engineering team continued to improve the user experience of the Translate extension to make it as smooth and efficient as possible. Highlights include:

  • Pressing the “Save” button immediately shows the next string to translate, while the saving is performed in the background.
  • When progressing to the next translation, the page smoothly scrolls up.
  • Explanations about translatable strings are shown beside the corresponding message in a convenient box, which becomes expandable if the documentation is too long.
  • Machine Translation was made available for suggested translations.
  • The differences between older versions of translatable strings are also shown in a new expandable box.
  • The Language Selector API was updated to allow displaying all the documentation strings.
  • The Solr search engine schema was tweaked to make searching translatable strings more efficient and feature-rich by offering faceted search.
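
For readers unfamiliar with faceted search, mentioned in the last item above, the following Python sketch shows the idea on a tiny in-memory list. Solr computes facets on the server from its index; this example only mimics the concept, and the field names and sample hits are assumptions invented for illustration.

```python
from collections import Counter

# Hypothetical search hits; the field names are assumptions for this example.
hits = [
    {"text": "Save page", "language": "fr", "group": "core"},
    {"text": "Save draft", "language": "fr", "group": "editor"},
    {"text": "Save page", "language": "ru", "group": "core"},
]


def facet_counts(results, field):
    """Count how many results fall under each value of `field`."""
    return Counter(hit[field] for hit in results)


def narrow(results, field, value):
    """Keep only the results whose `field` equals `value`."""
    return [hit for hit in results if hit[field] == value]


# A search page would show these counts next to the results...
by_language = facet_counts(hits, "language")

# ...and selecting one facet value narrows the results and updates the counts.
french_only = narrow(hits, "language", "fr")
by_group_in_french = facet_counts(french_only, "group")

print(dict(by_language))
print(dict(by_group_in_french))
```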

Below is a brief demo of the latest features of the translation editor in action. It shows the Etherpad Lite project being translated into Russian.

The Language team also continues to work on squashing bugs and adding prioritized features. You can check out the latest bleeding edge version of the translation editor on translatewiki.net, or go back to the stable translation editor. Please report Translate bugs in Bugzilla.

Amir E. Aharoni, Software Engineer (Internationalization)

Translation editor growing snazzier

(Emor me’at ve-ase harbe. —a Hebrew proverb)

The Wikimedia Foundation’s Language Engineering team is continuing the makeover of the Translate extension, which started taking shape in early December. (Introduced in 2011, this MediaWiki feature powers the translation of Wikipedia’s software, announcements, reports and fundraising banners, and of other sites and software projects.)

During its latest two-week sprint, the team improved the actual interface used for submitting and editing translations:

A screenshot of the new work-in-progress translation editor

Information about the message and translations to other languages are now shown in a collapsible box on the right side of the translation area. Warnings about potential errors in the message are shown in a small box above the editing area, which is expandable, too.

The functionality for saving and skipping messages was updated. Usability testing observations by Arun Ganesh and Pau Giner suggest that users facing a hard part in a translation are more likely to just skip it than to report the problem. Because of this, skipping a message is now recorded and frequently skipped messages will be considered for re-wording.

In the next sprint the team will work on polishing the translation interface further: better display of documentation, translation suggestions and diffs, better responsiveness, more robust language selection and other features.

In other Language Engineering news:

  • The December 2012 version of the MediaWiki Language Extension Bundle was released.
  • Better support for language variants and alternative language codes was added to the Universal Language Selector.

Amir E. Aharoni, Software Engineer (Internationalization)

Translation interface makeover in progress

Ei kannata mennä merta edemmäs kalaan. —a Finnish proverb

The Translate extension, a central piece in the puzzle that makes Wikipedia and the community around it massively multilingual, is getting a major overhaul.

“Translate”, as it’s commonly called, powers the translation of Wikipedia’s software, announcements, reports and fundraising banners, and of other sites and software projects. It focuses on making the translators’ work easy, efficient and, if possible, fun. The software gets frequent under-the-hood updates, and now the time has come for a major overhaul of its most visible part: the translation user interface.

Arun Ganesh and Pau Giner, from the Wikimedia Foundation’s Language Engineering team, have studied the current translation workflow by testing the software and interviewing translators. They drafted new interface ideas and tested experimental designs with users who speak different languages and have different levels of experience with the translation functionality.

In the team’s thirtieth two-week coding sprint, which ended last Tuesday, two major components of the overhaul have taken shape: the message group selector and the list of translatable messages.

The Message group selector. Message groups are groups of related translatable messages: a software project, a multilingual blog post or announcement, etc.

The group selector helps a translator find a project to translate using a tree-like structure of groups and sub-groups. Every project shows the completeness of the translation using a colorful progress bar. For quick and easy access to the projects that interest the translator, there’s a tab that shows recently used projects, and a responsive search function.

Listing of translatable interface messages for the Visual Editor. Some messages are translated to French and some need review.

The list of strings to translate has been redesigned to improve clarity, making it easier to scan and distinguish between messages that are translated, untranslated and need to be updated (“fuzzy”).
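
The three message states and the per-project progress bar described above can be modeled roughly as follows. This is a sketch under assumptions, not the Translate extension's code: the State enum and the group_progress function are hypothetical names chosen for this example.

```python
from collections import Counter
from enum import Enum


class State(Enum):
    TRANSLATED = "translated"
    UNTRANSLATED = "untranslated"
    FUZZY = "fuzzy"  # translated earlier, but needs to be updated


def group_progress(states):
    """Return the percentage of messages in each state for one message group."""
    total = len(states)
    if total == 0:
        return {state: 0.0 for state in State}
    counts = Counter(states)
    return {state: 100.0 * counts[state] / total for state in State}


# A hypothetical message group with four messages.
messages = [State.TRANSLATED, State.TRANSLATED, State.FUZZY, State.UNTRANSLATED]
progress = group_progress(messages)

for state, percent in progress.items():
    print(f"{state.value}: {percent:.0f}%")
```

A progress bar can then draw one colored segment per state, sized by its percentage.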

The development of the improved user experience continues. In the next sprints, the team will complete these features and add new ones, such as an improved sign-up process and better search. Usability testing efforts will continue to ensure that the new designs provide an improved experience. If you are interested in trying the new translation tools, please volunteer for our usability testing sessions.

Other ways to connect with the Language Engineering team:

  • Pau Giner and I will present on multilingual user testing and internationalization “dos and don’ts” in the live broadcast Wikimedia Open Tech Chat on Thursday, December 13 at 20:30 UTC.
  • We’ll hold IRC Office Hours on Wednesday, December 17 at 17:30 UTC. Topics of discussion will be the translation user experience improvements, Universal Language Selector and general Q&A.

Amir E. Aharoni, Software Engineer (Internationalization)

Language engineering news: Bugs fixed in Universal Language Selector, and a new IPA keyboard layout

Imagine a world in which every single human being can easily select the language of the website that they are reading.

One of the bugs that were fixed: not all elements of the Universal Language Selector’s user interface were using web fonts.

That’s what the Wikimedia Foundation’s Language Engineering team has been working on through the Universal Language Selector (ULS): a reusable user interface component for comfortable selection of the most appropriate language out of a long list of available options. It integrates new features from Project Milkshake, a set of portable JavaScript tools for internationalizing any web application with web fonts, keyboard layouts and a robust mechanism for loading translations.

The Universal Language Selector is already used on translatewiki.net and on the new Wikidata project, two massively multilingual communities of software translators and data curators, who are testing this feature in an actual production environment, and reporting many bugs. After coming back from the Bangalore Developer Camp, the team set out to fix the last major bugs in the ULS, and most notably:

Now, all buttons use web fonts and are readable.

Currently, the Universal Language Selector supports 68 keyboard layouts and 44 web fonts, and the number is growing. New fonts and keyboards are added according to the needs of the readers and the editors’ communities around the world.

In other news:

  • We held Language Engineering office hours on November 21.
  • Web fonts support was deployed to the Persian Wikipedia, but unfortunately reverted after the users found several issues with font rendering. The team hopes to fix the problems and deploy web fonts again, for the benefit of all the users who do not have good fonts installed on their computers and devices.
  • Niklas Laxström created the first test release of the MediaWiki Language Extension Bundle, an easy-to-install package of stable versions of several MediaWiki extensions that improve its multilingual support. It keeps your MediaWiki site’s interface translations up-to-date and includes “language skills” boxes, rich locale data, easy translation of content pages and site interface, and the aforementioned UniversalLanguageSelector, which helps users select the language.
  • A screenshot of MediaWiki with jquery.ime and the word ‘milkshake’ written in IPA.

    I created a keyboard mapping for easy typing in the International Phonetic Alphabet (IPA), based on the SIL IPA layout. The IPA is very commonly used as a pronunciation guide in Wikipedia and Wiktionary, and the deployment of the Universal Language Selector will make typing in IPA easier. Other IPA layouts may be easily added, for example X-SAMPA. You are very welcome to try this layout in translatewiki.net: click any text field, and select the English language and the SIL IPA layout in the keyboard layout pop-up.
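
To give a rough idea of how a rule-based keyboard mapping like this works, here is a small Python sketch. jquery.ime itself is a JavaScript library and its rule format is different; the three rules below are illustrative assumptions and should not be read as the actual contents of the SIL IPA layout.

```python
# Illustrative rules only: (typed key sequence, replacement character).
RULES = [
    ("s=", "ʃ"),
    ("i=", "ɪ"),
    ("n>", "ŋ"),
]


def type_key(buffer, key, rules=RULES):
    """Append one key press, then rewrite the end of the buffer if a rule matches."""
    buffer += key
    # Try longer sequences first so that a short rule cannot shadow a long one.
    for typed, replacement in sorted(rules, key=lambda rule: len(rule[0]), reverse=True):
        if buffer.endswith(typed):
            return buffer[: -len(typed)] + replacement
    return buffer


def type_text(keys):
    """Simulate typing a whole string one key at a time."""
    buffer = ""
    for key in keys:
        buffer = type_key(buffer, key)
    return buffer


result = type_text("mi=lks=eik")
print(result)
```

The point of the sketch is only the mechanism: plain keys pass through unchanged, and a multi-key sequence is replaced as soon as its last key is typed.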

The team’s next sprint marks the beginning of a new release, during which we’ll start implementing a major overhaul of the user interface of translatewiki.net.

Amir E. Aharoni, Software Engineer (Internationalization)

Wikipedia Engineering DevCamp sees a lot of energy and contributions in Bangalore

On November 9-11, the Wikimedia Foundation held a developer meetup in Bangalore, India. The gathering provided an opportunity for India-based developers to work with the Foundation’s engineering teams on several projects, such as JavaScript-based language engineering tools, and mobile applications with PhoneGap and LAMP technologies.

The DevCamp focused on language engineering, mobile development, and user interaction and experience design (UI/UX). It was attended by more than 85 developers, UX/UI designers, Wikimedians and translators. The work sessions focused on developing various Wikimedia mobile apps as well as language tools. The first day kicked off on Friday with two tutorials: “Developing mobile applications with PhoneGap” by Brion Vibber and “How to internationalize your code” by me. An interactive Q&A after both tutorials concluded the day with a lot of challenging and interesting questions.

The second day started off with Santhosh Thottingal introducing Project Milkshake (the team’s JavaScript-based internationalization libraries) and the Universal Language Selector, which is currently under development. The mobile team introduced various mobile projects, such as native mobile apps, the mobile front-end, and the VUMI-based feature phone apps that power Wikipedia Zero. Interaction designer Pau Giner introduced design projects and guided new contributors. People started selecting projects they were interested in and teamed up with Wikimedia engineers. It was exciting to see some contributors make their first-ever open-source commits during the DevCamp. People continued to hack throughout the two days.

The final day of the DevCamp started with stand-up updates from all participants, and ended with demos and presentations of 18 projects by 25 presenters. One of the most lovely updates was presented by Lakshmi, who learned to type in her language, Malayalam, using the typing tools that Wikimedia engineers have developed.

A screenshot of a mathematical formula rendered using the MathJax library, with a context menu in the Tamil language.

Accomplishments at the DevCamp include contributions to language engineering projects: contributors added unit tests to jquery.ime (the input method library for multiple language scripts), submitted bug fixes, and tested and reported bugs in jquery.ime and the Universal Language Selector. Another highlight was Brion Vibber’s integration of the Universal Language Selector, WebFonts and support for language variants into the Wikipedia mobile app. One of the contributors, Ershad, built a Google Chrome extension based on jquery.ime and won a Wikimedia shoulder bag for it. Other highlights include patches submitted to MathJax (a library used to render mathematical equations on HTML pages) by Aditya Ravi Shankar and me to add internationalization support.


On the mobile platform, Swayam made enhancements to the Translate proofreading mobile app. Other mobile apps developed at the DevCamp include a Commons uploader and an app to track recent changes. Patches were also submitted to MobileFrontend and to an iOS client library, and a first working version of the Wikipedia FirefoxOS app was produced.

On the UI/UX design projects, participants worked on ideas for redesigning the translatewiki.net home page, the Mobile Universal language selector, Commons discovery and triaging apps. Here’s a complete list of demonstrations that were made at the Bangalore DevCamp; you are welcome to join the coding fun!

All in all, the DevCamp maintained a high energy level throughout the three days and produced a lot of new code, bug fixes, input keymaps, unit tests, mobile apps, translation UI and mobile designs, and positive collaboration across the board.

Amir E. Aharoni, Software Engineer (Internationalization)

Group photo on the lawn of the IIM Bangalore.

Writing Malayalam on Wikipedia, just like with pen and paper

Lakshmi Valsalakumari is an IT professional who wants to expand her horizons. She attended the recent Wikimedia Developers Camp in Bangalore and had this story to tell:

A man and a woman working together at a laptop computer

Lakshmi with Santhosh Thottingal, the lead developer of Wikimedia’s font and keyboard tools

I have been an Information Technology professional working with well-known software organizations over the last 15 years. While IT has been keeping me busy, productive and happy, I have also all along harbored an interest in history and the humanities. I have recently decided to pursue these interests full-time, joining a research program at the Centre of Exact Humanities, International Institute of Information Technology, Hyderabad, India.

With my recent shift into academics and research, I have been referencing Wikipedia quite a bit in the last two to three months, and I have been amazed at the sheer magnitude of information found on it. While I have been reading the Wikipedia pages extensively, I had never yet considered editing it, not even in English, the language I reference Wikipedia most in, and the one I use most on computers.

Editing and contributing content in Malayalam, my mother tongue, had not really occurred to me either—Malayalam being a language I hardly used on my computers—until I attended the Bangalore Wikimedia Dev Camp.

I have tried typing Malayalam using my regular browser, but I have not been very happy with the effect. This was not the way I liked to see Malayalam written and rendered, so I had not made any further efforts to write Malayalam online. At the camp, I met Santhosh and Manoj—avid Malayalam Wikipedia contributors—and they persuaded me to give it another shot.

The first step was to download the Meera Unicode font for Malayalam, then to change my default browser to one of those that can render Meera well (I tried out Google Chrome; Firefox was even better, I was told), and then to try out typing Malayalam using the regular English keyboard.

I liked what I saw. When I typed the suggested key combinations, even complicated Malayalam letter combinations were being rendered the way I would have written them using pen and paper. I tried more and more combinations—ta, tha, tta, Ta, tma, thra, tya, zha—and was pleased with the effect. This was fun!

The words "Catalonia" and "Lakshmi" typed in Latin transliteration and in Malayalam letters

Demos of how transliteration keyboards for Malayalam work

Soon, I was creating my first article. I noticed that on the main Wikipedia page, an article on Barcelona mentioned Catalonia as a red link, meaning that no further information was available in the Malayalam Wikipedia on it, whereas there was plenty of information on the same subject in the English Wikipedia. Manoj guided me through the steps as I created my first page in the Malayalam Wikipedia, copied the template information over from the English article and saved the heading, trying to get it right in Malayalam. I viewed my saved efforts, and with a sense of achievement, I went to grab a coffee.

Back online with my coffee, I was surprised to find a message on the article Talk page—someone had already posted a comment on the page I had just saved, chiding me for the lack of content and references. “This will drive away people from Wikipedia,” the post read. “Please ensure I get enough content on the page!”

Man, that was fast! I had no idea people were watching and following Wikipedia edits this closely. Manoj encouraged me to type more, so I returned to my effort. While I was getting comfortable with the typing, I was still grappling for suitable words in Malayalam for the content I was reading in English. Manoj suggested Olam, an online dictionary, and sure enough, I was able to find several of the Malayalam equivalents I was searching for.

And so, I typed on. Again, to my surprise, I found people editing the content and giving helpful suggestions even as I was still typing—one person told me to leave native names as such and not translate those, and another formatted some of the changes. By the end of the day, I had posted a decent amount of info, although there remained much more to be added.

I was happy with my day’s work. I had never imagined that using Malayalam on my computer and editing the Malayalam Wikipedia content would be such a pleasant and enjoyable experience, one that I was actually looking forward to!

Another point I must mention here is the sheer volume of Malayalam content that I have started seeing online, on Wikipedia pages and elsewhere. This must be due to the attention paid to this field of languages, literature and culture online by movements like Wikimedia. In 2005, I remember searching online for a well-known Malayalam lullaby Omanathingalkkidavo by Irayimman Thampi, but could not find anything. I had then resorted to the memories of my immediate relatives to try and pen the forgotten lyrics. Now, when I search for the same, the amount of material that comes up on that lullaby is amazing!

My heartfelt appreciation to Wikipedia and all its online community members who have made all of this possible. I hope to be part of this movement myself and do my bit toward furthering easy availability of multilingual content online.

Lakshmi Valsalakumari


The Wikimedia Language Engineering team is developing technologies that make it possible for speakers of all languages to contribute to Wikipedia in their language as easily and naturally as possible. Lakshmi’s story is an example of how these technologies enable people to develop reference and educational content that makes Wikipedia useful to people all over the world. These technologies are deployed in Wikipedias in most languages of India, and more languages and projects are being added all the time.

Amir E. Aharoni, Software Engineer (Internationalization)

Translate Wikidata’s user interface and open it to the world

Wikidata is one of the most important and exciting innovations in the world around Wikipedia. To make it accessible to a wide range of users, it needs its user interface to be translated to as many languages as possible, and you can help.

In the first stage, already partly enabled, Wikidata stores “interwiki links”, i.e. page metadata that connects articles about the same topic in different language versions of Wikipedia. Historically, these interwiki links have been duplicated and stored in each of the pages they linked together. With Wikidata, the list of pages about the same topic is centralized.
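
The difference between duplicated and centralized interwiki links can be sketched in a few lines of Python. This is only a conceptual model, not Wikidata's actual data format or API: the identifier "Q-example" is made up, and the helper names build_index and interwiki_links are assumptions for this example.

```python
# Centralized model: one record per topic lists the page title in each
# language edition. "Q-example" is a made-up identifier for illustration.
items = {
    "Q-example": {"en": "Catalonia", "ca": "Catalunya", "es": "Cataluña"},
}


def build_index(items):
    """Map each (language, title) pair back to the record that contains it."""
    return {
        (lang, title): item_id
        for item_id, links in items.items()
        for lang, title in links.items()
    }


def interwiki_links(lang, title, items, index):
    """Return the links to show on one page: every other language's title."""
    item_id = index.get((lang, title))
    if item_id is None:
        return {}
    return {
        other_lang: other_title
        for other_lang, other_title in items[item_id].items()
        if other_lang != lang
    }


index = build_index(items)
links_from_en = interwiki_links("en", "Catalonia", items, index)

# Adding a new language edition is a single update to the central record;
# with duplicated links, every existing page would need to be edited.
items["Q-example"]["fr"] = "Catalogne"
index = build_index(items)
links_after = interwiki_links("ca", "Catalunya", items, index)

print(links_from_en)
print(links_after)
```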

The next goal of Wikidata is to store not only page metadata like interwiki links, but also common data that is repeated in all languages, such as census data for cities and dates of birth and death of famous authors.

Practically all the projects that are related to Wikipedia are massively multilingual, but Wikidata is especially so: it stores common data with the goal of displaying it efficiently in all languages.

The very useful and famous CIA World Factbook site has tables of data about all countries in the world, but the labels are only written in English. Now imagine a site with such tables, but with the ability to display the labels in any language and not just English: that’s what Wikidata aims to become.

In the near future, the translation of such table labels will be done on the Wikidata website itself. In the meantime, you can help by translating the user interface displayed by the software running Wikidata.

Translation of the Wikidata software is done on translatewiki.net, the same translation platform used to translate Wikipedia’s interface. Wikidata relies on three main components that need translating: Wikibase – Repo, Wikibase – Client and Wikibase – Lib.

Wikipedia made encyclopedic articles open and accessible; Wikidata is about to do the same to statistics and other structured information. To ensure that people speaking your language can benefit from the immense potential of Wikidata, and contribute to its success, please join us today and help us translate it.

Thank you!

Amir Aharoni
Software Engineer (Internationalization)