On July 17, 2014, the Wikimedia Language Engineering team announced the deployment of the ContentTranslation extension in Wikimedia Labs. This first deployment was targeted primarily for translation from Spanish to Catalan. Since then, users have expressed generally positive feedback about the tool. Most of the initial discussion took place in the Village pump (Taverna) of the Catalan Wikipedia. Later, we had the opportunity to showcase the tool to a wider audience at Wikimania in London.

Initial response

In the first 2 weeks, 29 articles were created using the Content Translation tool and published in the Catalan Wikipedia. Article topics were diverse, ranging from places in Malta, to companies in Italy, a river, a monastery, a political manifesto, and a prisoner of war. As the Content Translation tool is also being used for testing by the developers and other volunteers, the full list of articles that make it to a Wikipedia is regularly updated. The Language Engineering team also started addressing some of the bugs that were encountered, such as issues with paragraph alignment and stability of the machine translation controller.

More than 100 articles have now been published using Content Translation, and its usage has not been limited to the Catalan Wikipedia. Users have been creating articles in other languages like Gujarati and Malayalam, although machine translation has not yet been extended beyond Spanish-Catalan. All the pages that were published as articles had further edits for wikification, grammar correction, and in some cases meaningful enhancement. A deeper look at the edits revealed that the additional changes were first made by the same user who made the initial translation, and later by other editors or bots.

Wikimania in London

Amir Aharoni of the Wikimedia Language Engineering team introduces the Content Translation tool to the student delegation from Kazakhstan at Wikimania 2014, in London.

The Content Translation tool was showcased widely at Wikimania 2014, the annual conference of the Wikimedia communities. In the main conference, Santhosh Thottingal and Amir Aharoni gave a presentation on machine-aided translation delivery through Content Translation. During the pre-conference hackathon, Pau Giner conducted a testing session with student volunteers from Kazakhstan, who were enthusiastic about using the tool in their local Wiki Club. Many users and groups, such as the Wikipedia Medical Translation project, requested full support for other language pairs. Discussions were held with the Wikidata team to identify areas of collaboration on data reuse, such as categories and links, for consistent referencing across translated versions.

The Language Engineering team members worked closely with Wikimedians to better understand the requirements for languages including Arabic, Persian, Portuguese, Tajik, Swedish, and German. This understanding can be instrumental in extending support to these languages.

Further development

The development of ContentTranslation continues. Prior to Wikimania, the Language Engineering team met to evaluate the response to the first release of the tool and its effectiveness, and prepared the goals for the next release. The second release is slated for the last week of September 2014. Among the features planned are support for more languages (machine translation, dictionaries), a smarter entry point to the translation UI, and basic editor formatting. It is expected that translation support from Catalan to Spanish will be activated by the end of August 2014. Read the detailed release plan and goals to learn more.

Over the next couple of months, the Language Engineering team intends to work closely with our communities to better understand how the Content Translation tool has helped editors so far, and how it can better serve the global community with the translation aids and resources currently integrated with the tool. We welcome feedback at the project talk page. Get in touch with the Language Engineering team for more information.

Amir Aharoni and Runa Bhattacharjee, Language Engineering, Wikimedia Foundation