
Improving the quality of articles has long been one of the primary aims of contributors to Wikipedia, and is one of the Wikimedia movement’s 2010-15 strategic priorities, but measuring it objectively has remained a challenge. In 2005, Nature famously reported that Wikipedia articles on scientific topics contained just four errors per article on average, compared to three errors per article in the online edition of Encyclopaedia Britannica. Britannica objected to the report, but Nature stood by it, and the report remains widely cited today.

Since that time, however, there have been relatively few independent analyses of Wikipedia article quality, despite the enormous growth of the project. Wikipedia today counts more than 23 million articles across languages (more than 4 million articles in the English Wikipedia alone) compared to 3.7 million total articles in 2005; today it ranks 6th by overall traffic according to Alexa, while it ranked 37th in 2005.

With this increase in size and reach, how has quality evolved? How does Wikipedia compare today to other online encyclopedias in terms of quality? And what are good methods to measure the quality of encyclopedic articles?

The Wikimedia Foundation is announcing the release of a pilot study conducted by Epic, an e-learning consultancy, in partnership with Oxford University – “Assessing the Accuracy and Quality of Wikipedia Entries Compared to Popular Online Alternative Encyclopaedias: A Preliminary Comparative Study Across Disciplines in English, Spanish and Arabic.”

The study compared a sample of English Wikipedia articles to equivalent articles in Encyclopaedia Britannica, Spanish Wikipedia to Enciclonet, and Arabic Wikipedia to Mawsoah and Arab Encyclopaedia. Twenty-two articles in the sample were blind-assessed by two to three native-speaking academic experts each, both quantitatively and qualitatively.

The small size of the sample does not allow us to generalize the results to Wikipedia as a whole. However, as a pilot primarily focused on methodology, the study offers new insights into the design of a protocol for expert assessment of encyclopedic content. For our editor community and for the Foundation, which commissioned the study in 2011, it also offers evidence to inform the design of quality assessment mechanisms and quality metrics that may be used on Wikipedia itself.

The results suggest that Wikipedia articles in this sample scored higher overall in each of the three languages, and fared particularly well in the categories of accuracy and references. As the report notes, the English Wikipedia fared well in this sample against Encyclopaedia Britannica in terms of accuracy, references and overall judgment, with little difference between the two on style and overall quality score. Similar results were found when comparing Wikipedia articles in Spanish to Enciclonet. In Arabic, Mawsoah and Arab Encyclopaedia articles scored higher on style than Wikipedia, but no significant differences were found on accuracy, references, overall judgment and overall quality score. None of the encyclopedias considered in this study were rated highly by the academics in terms of suitability for citation in academic publications.

We hope that the results of this study will encourage further independent research on the quality of Wikipedia articles. To this end, Epic and Oxford University are releasing the full version of the report of this study under a Creative Commons Attribution-Share Alike license. They have announced the report here and have released an anonymized dataset under a Creative Commons Zero dedication. The team welcomes comments and feedback on the talk page of the project.

We are very encouraged by the results for this small sample of Wikipedia articles in three languages. While pointing the way forward for further research, these results affirm the quality of the collaborative work of our editor community.

Dario Taraborelli, Senior Research Analyst