The University of South Africa, seen here, has been the backdrop for several recent and successful Wikipedia editing workshops. Photo by A. Bailey, freely licensed under CC BY-SA 3.0.

Content Translation is starting to change how people are joining Wikipedia. I saw this up close in early June at the University of South Africa (UNISA) in Pretoria, where I went to advance Wikipedia writing in the languages of that country.

It all began when I met Laurette Pretorius, a professor of computer science at UNISA, at the Multilingual Web Workshop in Madrid in May 2014. My team went there to present a preview of Content Translation, which back then was in the very early stages of development. In her presentation, Prof. Pretorius described the work that her team has been doing on improving the encyclopedia’s coverage of the languages of South Africa, such as Zulu, Xhosa, and Afrikaans, with modern tools like natural language processing, digital dictionaries, educational materials, machine translation, and linked data. As this sounded very close to some of the things that Wikimedia is trying to achieve with the Content Translation and Wikidata projects, we exchanged emails.

Laurette Pretorius is working on ways to better represent South African languages online. Photo by Petterual, freely licensed under CC BY-SA 4.0.

After several conversations, Laurette decided not only to start using the editions of Wikipedia in the different languages of South Africa more in her team’s work, but also to bring them to the attention of other departments in the university. She decided to organize an all-day workshop for students and staff members, and an additional translation workshop.

The first workshop was attended by about fifty people. It opened with introductions from Prof. Lesiba Teffo, director of the Unisa School of Transdisciplinary Research Institutes, who called upon all speakers of African languages to embrace modern technology and improve the online educational content in their languages. He was followed by Prof. Pretorius, who spoke about the importance of having a well-developed Wikipedia community for education in any language and cited works by Neville Alexander and András Kornai. Friedel Wolff, who is well known in the free software internationalization community as the developer of the Pootle and Virtaal localization tools, and who now works on Prof. Pretorius’s team, presented general advice to translators of Wikipedia articles.

My central presentation was a demonstration of Content Translation in action. Content Translation is now enabled as a beta feature on every language edition of Wikipedia. After a short general explanation of the tool’s features, I invited Nozibele Nomdebevana, a researcher of the Xhosa language on Laurette’s team and the most active contributor to the Xhosa Wikipedia in the last year, to translate the article “Distance” straight into the Xhosa Wikipedia using Content Translation. In just a bit more than an hour, including all the explanations and the questions from the audience, the article was ready and published.

The greatest moment for me personally came when I asked the Xhosa-speaking audience members what they thought about the text of the article that was taking shape on the projected screen in front of them, and a young woman remarked, “It’s perfect.”

The next day, I led another, smaller workshop focused on practical hands-on translation of articles. It was attended by nine people, only two of whom had any experience with writing on Wikipedia. In about six hours, fourteen new articles were added to the Xhosa, Zulu, Tswana, Sotho, and Afrikaans Wikipedias, as well as to the French Wikipedia (one of Laurette’s students speaks French). Most of them were fairly complete in their first published versions: between 2 and 6 KB long, with illustrations and references.

I have led dozens of Wikipedia editing workshops, and this was the most productive in terms of the amount of content created. It felt really effective: usually, I have to spend a lot of time explaining how to do basic things, such as creating articles and adding links, categories, and images. With Content Translation, the participants were already well into writing actual content after just ten minutes.

Among the translated articles are “Phonetics” (in both Zulu and Xhosa), “Apartheid”, “Mikhail Lomonosov” (a Russian scientist and scientific terminology creator), and “Lightning bird” (a mythological creature in the culture of South Africa).

This is a taste of the things to come. With Content Translation, more content can be created in more languages with less effort. It’s easier than ever for new Wikipedia contributors to join in the fun and share their knowledge with the people who speak their languages.

Amir E. Aharoni, Product Manager, Language Engineering team, Wikimedia Foundation