Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts Tagged ‘input methods’

Report from the Spring 2013 Open Source Language Summit

Fortuna i forti aiuta, e i timidi rifiuta — an Italian proverb

The Wikimedia Foundation and Red Hat jointly organized the Second Open Source Language Summit on February 12th and 13th, 2013. The summit was held at the Red Hat engineering center in Pune, India. Similar to the previous summit, this face-to-face work session was focused on internationalization (i18n) and localization (l10n) features, font support, input method tools, language search, i18n testing methods and standards. The sessions were work sprints, each with special focus on a key area. Participants included core contributors from the Wikimedia Foundation, Red Hat (including Fedora SIG members), KDE, FUEL, Google and C-DAC. Below is a summary of what was accomplished during these two days.

During the summit, teams from different organizations came together to discuss language-related challenges, and worked together on features and tools to address them.

During the summit, teams from different organizations came together to discuss language-related challenges, and worked together on features and tools to address them.

Input Methods

Parag Nemade and Santhosh Thottingal worked on making additional input methods available for the jQuery.IME library. 60 input methods, covering languages like Assamese, Esperanto, Russian, Greek, Hebrew were added bringing the total to 144. Also IMEs from the m17n library missing from the jQuery.IME library were identified.

Translation tools, translatewiki.net & FUEL Sprint

Siebrand Mazeland and Niklas Laxström, together with Ankit Patel, Rajesh Ranjan and Red Hat language maintainers, worked to identify more tools that could be used as Translation aids in a translation system. The FUEL project aims to standardize translations for frequently used terms, translation style and assessment methodology. Until now it has focused mostly on languages of India. The FUEL project can now be translated in translatewiki.net. Pau Giner demonstrated new designs for the translation editor and terminology usage, remotely from Spain.

Language Coverage Matrix

To better evaluate the needs for enabling support for languages, a matrix detailing the requirements and availability of basic and extended features is being drawn up. With 285 languages currently supported in Wikimedia and more than 100 in Fedora, this document will be instrumental in bridging the gaps and porting features across projects and platforms. Key areas of evaluation include input methods, fonts, translation aids like glossaries and spell-checkers, testing and validation methods, etc. A preliminary draft was created during the summit by Alolita Sharma, Runa Bhattacharjee and Amir E. Aharoni.

Fonts, WebFonts

An initiative to document the technical aspects of fonts for scripts for languages spoken in India started during the language summit. For each of the scripts, a reference font will be chosen and each font will be explained in detail to intersect with the Open Type font specification as a standard. It will aim to act as a reference document for any typographer working on Indian language fonts. Initial draft and outline of this document was prepared during the second day of the language summit, mainly by Santhosh Thottingal and Pravin Satpute.

Testing Internationalization Tools

Finding suitable methods for testing internationalized components and contents was the major focus of this sprint, with the Fedora Localization Testing Group (FLTG) and Wikimedia’s Language Engineering team sharing details of their testing methods. The FLTG conducts Test Days prior to Fedora beta releases with a test matrix targeted at specific core components, and Wikimedia uses unit tests for frequent testing of their development features. The FLTG showed its plans to integrate the screenshot comparison method for testing localized interfaces. This method will be useful for Wikimedia too. Extending the method for web-based applications and Wikimedia’s language requirements (e.g. right-to-left) were identified as areas for collaboration.

More news from the Language Summit can be found in the tweets, the session notes and the full report.

Runa Bhattacharjee, Outreach and QA coordinator, Language Engineering

OpenSource Language Summit

The Wikimedia Foundation and Red Hat co-organized an Open Source Language Summit in Pune, India on November 6-7, 2012. The summit focused on language tools and technology development to support languages on Wikipedia, the Web, Linux and other Open Source platforms.

Santhosh Thottingal presenting his talk on jquery.ime

In total, 45 core language technology developers, open source contributors, typographers and technology evangelists from the Wikimedia Language Engineering and Mobile teams, Red Hat, Mozilla Foundation, KDE, GNOME, translatewiki.net and other open source projects participated in sessions and work sprints on internationalization and localization features supporting various open source projects on the web and Linux. After brief introductory talks, we focused our work on font support, input method tools, language search, and web and localisation standards.

Highlights: 

The event had short talks on the following topics:

Selected achievements

The following people won prizes for their code contributions during the event:

  • Anish Patil ported Universal Language Selector’s cross-language search algorithm to gnome language search
  • Aravinda VK wrote a set of font-forge python wrappers to make changes to fonts programmatically. Aravinda fixed a few bugs in Kannada Gubbi font for Harfbuzz rendering engine and also wrote Kannada KGP keymap for jquery.ime
  • G Karunakar added Hindi inscript keyboard layout to Firefox OS GAIA

Other accomplishments included:

  • Kushal Das added patches to deploy Universal Language Selector on http://www.mozilla.org and also a patch for a bug on Mozilla localization platform.
  • Alolita, Sankarshan, Runa, Satish worked on discussing APIs for various translation workflows and putting together an initial specification.
  • Rajeesh Nambiar, Hussain KH, Ani Peter, Praveen A and Pravin Satpute fixed and filed upstream bugs for Malayalam, Kannada, Gujarati and Punjabi fonts with Harfbuzz.
  • Parag Nemade added InScript2 keyboards for Sanskrit, Nepali, Marathi and Konkani to jquery.ime.
  • Ankit Gadgil wrote over 200 unit tests for Marathi and Hindi input methods in jquery.ime.
  • Yuvaraj Pandian, Pau Giner, Arun Ganesh and Siebrand Mazeland developed an initial version of an Android-native app for Translatewiki.net for translation reviews.
  • Pau Giner conducted user testing with new translation prototypes with translators. Arun Ganesh created an icon for gnome-transliteration.

You can browse through tweets and more notes from the event. Happy reading!

Srikanth Lakshmanan
Internationalisation/Localisation Outreach / QA Engineer

Universal Language Selector now has Input Methods

The Language Engineering team at the Wikimedia Foundation works on a set of tasks every two weeks. This post is about the team’s accomplishment over the past two weeks. 

Have you ever sat at a computer in a foreign country, and wondered how you were going to enter text in your language using a keyboard with a different alphabet?

“Input methods” are interfaces that allow users to enter text in a script different from the one used on their keyboard. On some Wikipedia versions (like wikis in Indic languages), such a tool has been available through the Narayam extension.

As part of Project Milkshake, this feature has recently been exported to a JavaScript library (a bundle of code, called jquery.ime) so that it could be reused by other web developers.

Another language-related tool, the Universal Language Selector (ULS), allows readers of Wikipedia and its sister sites to easily pick the language of their choice for the website’s interface.

Over the last two weeks, we’ve integrated the input methods’ functionality directly into the Universal Language Selector: it now comes with a large set of input tools that users can use to input text in non-latin languages.

The integration of the two tools makes the interface more consistent and usable when it comes to choosing languages in which to read (“display”) and to write (“input”) on the site: both settings are located in the same dialog of the Universal Language Selector.

When selecting a language in which to write, it’s possible to set an accompanying preferred input method for that language, if available. When input methods have been assigned to different writing languages, switching between languages in the menu will automatically change to the preferred input method for that language.

Other language engineering news in brief:

  • The Language Engineering team will be in India during the second week of November to participate in the OpenSource Language Summit in Pune, and the Wikimedia DevCamp in Bangalore. For new volunteers who want to get started contributing to our tools, we’ve prepared a list of bugs that you can work on at these events with our support.
  • We’ve also worked on finalizing the development plan and features for Translate UX improvements, which were identified by user testing with volunteer translators to improve translation efficiency.
  • We’ve worked on how to get metrics on the impact of our tools through URL-based usage data gathering. Feedback is welcome.
  • We’ve fixed some bugs related to the ULS and gender support in MediaWiki and MediaWiki extensions.
  • The Narayam and Webfonts extensions were deployed to Wikimedia sites in Marathi; Narayam was also deployed to sites in Amharic.
  • An early stable version of ULS was deployed on Wikidata; this first use on a production site revealed a few bugs that were fixed. It will be updated to the latest stable version periodically.

Srikanth Lakshmanan
Internationalisation/Localisation Outreach / QA Engineer