Video: How to translate a Wikipedia article in 3 minutes with Content Translation. This video can also be viewed on YouTube (4:10). Screencast by Pau Giner, licensed under CC BY-SA 4.0.

The Wikimedia Foundation’s Language Engineering team is happy to announce the first version of Content Translation on Wikipedia for 8 languages: Catalan, Danish, Esperanto, Indonesian, Malay, Norwegian (Bokmål), Portuguese and Spanish. Content Translation, available as a beta feature, provides a quick way to create a new article by translating an existing article from another language. It is also well suited for new editors looking to familiarize themselves with the editing workflow. Our aim is to build a tool that leverages the power of our multicultural global community to further Wikimedia’s mission of creating a world where every single human being can share in the sum of all knowledge.

Design

During early 2014, when the design ideas for Content Translation were being conceptualized, we came across an interesting study by Scott A. Hale of the University of Oxford on the influences and editing patterns of multilingual editors on Wikipedia. Combined with feedback from editors we interacted with, the data presented in the study guided our initial choices, both in terms of features and languages. We were fortunate to meet the researcher in person at Wikimania 2014, where we could learn more about his findings and references.

The tool was designed with multilingual editors as its main target users. Several important patterns emerged from a month-long user study, including:

  • Multilingual editors are relatively more active in smaller Wikipedias. Editors from smaller Wikipedias often also edit a relatively large Wikipedia such as English or German;
  • Multilingual editors often edited the same articles in their primary and non-primary languages.

These and other factors listed in the study affect the transfer of content between different language versions of Wikipedia: they increase content parity between versions and decrease ‘self-focus’ bias in individual editions.

Languages

When selecting languages for the tool’s introduction, we were guided by several factors, including signs of relatively high multilingualism amongst the primary editors. The availability of high-quality machine translation was an additional consideration, so that we could fully explore the usability of the core editing workflow designed for the tool. Based on these considerations, Catalan Wikipedia, a very actively edited project of medium size, was a logical choice. Subsequent language selections were made by studying possible overlap trends between language users and the probability of editors benefiting from those overlaps when creating new articles. Community requests and the availability of machine translation to speed up the process were also important considerations.

How it works

The article Abel Martín in the Spanish Wikipedia doesn’t have a version in Portuguese, so a red link to Portuguese is shown.
Content Translation red interlanguage link screenshot by Amire80, licensed under CC BY-SA 4.0

Content Translation combines a rich text translation interface with tools targeted at editing, plus machine translation support for most language pairs. It integrates different tools to automate repetitive steps during translation: it provides an initial automatic translation while keeping the original text format, links, references, and categories. To do so, the tool relies on the inter-language connections from Wikidata, HTML-to-wikitext conversion from Parsoid, and machine translation support from Apertium. This saves time for editors and allows them to focus on creating quality content.
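To make the idea of "translate the text, keep the links" more concrete, here is a minimal sketch in Python. It is not the tool's actual code: the Paragraph type, translate_paragraph, and the toy translator are hypothetical names invented for illustration. The machine_translate argument stands in for a machine translation service, and target_titles stands in for inter-language connections (source title to target title). The sketch also assumes that links with no known target article are simply dropped from the draft, which may differ from what the tool does.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Paragraph:
    """One paragraph: plain text plus the titles of the articles it links to."""
    text: str
    links: List[str] = field(default_factory=list)


def translate_paragraph(
    para: Paragraph,
    machine_translate: Callable[[str], str],
    target_titles: Dict[str, str],
) -> Paragraph:
    """Return a draft translation that keeps links with a known target article.

    Both machine_translate and target_titles are hypothetical placeholders
    for external services; nothing here calls a real API.
    """
    draft_text = machine_translate(para.text)
    # Assumption: links without a known target-language article are dropped.
    adapted = [target_titles[t] for t in para.links if t in target_titles]
    return Paragraph(text=draft_text, links=adapted)


# Toy stand-ins so the sketch runs without any network access.
toy_dictionary = {"español": "espanhol"}


def toy_translate(text: str) -> str:
    """Word-by-word lookup; a real translation engine does far more than this."""
    return " ".join(toy_dictionary.get(word, word) for word in text.split())


source = Paragraph(text="poeta español", links=["Poesía", "Sevilla"])
draft = translate_paragraph(source, toy_translate, {"Poesía": "Poesia"})
print(draft.text)
print(draft.links)
```

The point of the sketch is the separation of concerns: the translation step only sees text, while link adaptation is a lookup against a separate mapping, so the editor receives a draft whose links already point at articles in the target language.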

Although basic text formatting is supported, the purpose of the tool is to create an initial version of the content that each community can keep improving with their usual editing tools. Content Translation is not intended to keep the information in sync across multiple language versions, but to provide a quick way to reuse the effort already made by the community when creating an article from scratch in a different language.

The tool can be accessed in different ways. There is a persistent access point on your contributions page, but access to the tool is also provided in situations where you may want to translate the content you are currently reading. One example is a red link in the interlanguage link area (see image).
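The red-link entry point can be thought of as a simple set difference: the languages an editor works in, minus the languages in which the article already exists. The following sketch illustrates that idea only. The function name missing_languages and the toy data are assumptions made for this example, not the tool's real logic or real article data.

```python
from typing import Dict, List


def missing_languages(
    existing_versions: Dict[str, str],
    editor_languages: List[str],
) -> List[str]:
    """Return the editor's languages that have no version of the article yet.

    existing_versions maps a language code to the article title in that
    language. The order of editor_languages is preserved in the result.
    """
    return [lang for lang in editor_languages if lang not in existing_versions]


# Toy data for illustration: only a Spanish version is listed here.
versions = {"es": "Abel Martín"}
candidates = missing_languages(versions, ["es", "pt"])
print(candidates)  # ['pt'], so a red link to Portuguese could be offered
```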

Next steps

Next steps for the tool’s future development include adding support for more – eventually all – languages, managing lists of articles to translate, and adding features for more streamlined translation.

In the coming weeks, we will closely monitor feedback from users and interact with them to guide our future development. Please read the release announcement for more details about the features and instructions on using the tool. Thank you!

Amir Aharoni, Pau Giner, Runa Bhattacharjee, Language Engineering, Wikimedia Foundation