The projects in the Wikimedia universe can be accessed and used in a large number of languages from around the world. The Wikimedia websites, their MediaWiki software (bot core and extensions) and their growing content benefit from standards-driven internationalization and localization engineering that makes the sites easy to use in every language across diverse platforms, both desktop and and mobile.

However, a wide disparity exists in the numbers of articles across language wikis. The article count across Wikipedias in different languages is an often cited example. As the Wikimedia Foundation focuses on the larger mission of enabling editor engagement around the globe, the Wikimedia Language Engineering team has been working on a content translation tool that can greatly facilitate the process of article creation by new editors.

About the Tool


The Content Translation editor displaying a translation of the article for Aeroplane from Spanish to Catalan.

Particularly aimed at users fluent in two or more languages, the Content Translation tool has been in development since the beginning of 2014. It will provide a combination of editing and translation tools that can be used by multilingual users to bootstrap articles in a new language by translating an existing article from another language. The Content Translation tool has been designed to address basic templates, references and links found in Wikipedia articles.

Development of this tool has involved significant research and evaluation by the engineering team to handle elements like sentence segmentation, machine translation, rich-text editing, user interface design and scalable backend architecture. The first milestone for the tool’s rollout this month includes a comprehensive editor, limited capabilities in areas of machine translation, link and reference adaptation and dictionary support.

Why Spanish and Catalan as the first language pair?

Presently deployed at http://es.wikipedia.beta.wmflabs.org/wiki/Especial:ContentTranslation, the tool is open for wider testing and user feedback. Users will have to create an account on this wiki and log in to use the tool. For the current release, machine translation can only be used to translate articles between Spanish and Catalan. This language pair was chosen for their linguistic similarity as well as availability of well-supported language aids like dictionaries and machine translation. Driven by a passionate community of contributors, the Catalan Wikipedia is an ideal medium sized project for testing and feedback. We also hope to enhance the aided translation capabilities of the tool by generating parallel corpora of text from within the tool.

To view Content Translation in action, please follow the link to this instance and make the following selections:

  • article name – the article you would like to translate
  • source language – the language in which the article you wish to translate exists (restricted to Spanish at this moment)
  • target language – the language in which you would like to translate the article (restricted to Catalan at this moment)

This will lead you to the editing interface where you can provide a title for the page, translate the different sections of the article and then publish the page in your user namespace in the same wiki. This newly created page will have to be copied over to the Wikipedia in the target language that you had earlier selected.

Users in languages other than Spanish and Catalan can also view the functionality of the tool by making a few tweaks.

We care about your feedback

Please provide us your feedback on this page on the Catalan Wikipedia or at this topic on the project’s talk page. We will attempt to respond as soon as possible based on criticality of issues surfaced.

Runa Bhattacharjee, Outreach and QA coordinator, Language Engineering, Wikimedia Foundation