Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts by Dario Taraborelli

Seven years after Nature, pilot study compares Wikipedia favorably to other encyclopedias in three languages

This post is available in 3 languages: العربية 100% • Español 7% • English 100%

Improving the quality of articles has long been one of the primary aims of contributors to Wikipedia, and is one of the Wikimedia movement’s 2010-15 strategic priorities, but measuring it objectively has remained a challenge. In 2005, Nature famously reported that Wikipedia articles on scientific topics contained just four errors per article on average, compared to three errors per article in the online edition of Encyclopaedia Britannica. Britannica objected to the report, but Nature stood by it, and the report remains widely cited today.

Since that time, however, there have been relatively few independent analyses of Wikipedia article quality, despite the enormous growth of the project. Wikipedia today counts more than 23 million articles across languages (more than 4 million articles in the English Wikipedia alone) compared to 3.7 million total articles in 2005; today it ranks 6th by overall traffic according to Alexa, while it ranked 37th in 2005.

With this increase in size and reach, how has quality evolved? How does Wikipedia compare today to other online encyclopedias, quality-wise? And what are good methods to measure the quality of encyclopedic articles?

The Wikimedia Foundation is announcing the release of a pilot study conducted by Epic, an e-learning consultancy, in partnership with Oxford University – “Assessing the Accuracy and Quality of Wikipedia Entries Compared to Popular Online Alternative Encyclopaedias: A Preliminary Comparative Study Across Disciplines in English, Spanish and Arabic.”

The study compared a sample of English Wikipedia articles to equivalent articles in Encyclopaedia Britannica, Spanish Wikipedia to Enciclonet, and Arabic Wikipedia to Mawsoah and Arab Encyclopaedia. Twenty-two articles in the sample were blind-assessed by two to three native-speaking academic experts each, both quantitatively and qualitatively.

The small size of the sample does not allow us to generalize the results to Wikipedia as a whole. However, as a pilot primarily focused on methodology, the study offers new insights into the design of a protocol for expert assessment of encyclopedic contents. For our editor community and for the Foundation, which commissioned the study in 2011, it also offers evidence to inform the design of quality assessment mechanisms and quality metrics that may be used on Wikipedia itself.

The results suggest that Wikipedia articles in this sample scored higher overall in each of the three languages, and fared particularly well in the categories of accuracy and references. As the report notes, the English Wikipedia fared well in this sample against Encyclopaedia Britannica in terms of accuracy, references and overall judgment, with little difference between the two on style and overall quality score. Similar results were found when comparing Wikipedia articles in Spanish to Enciclonet. In Arabic, Mawsoah and Arab Encyclopaedia articles scored higher on style than Wikipedia, but no significant differences were found on accuracy, references, overall judgment and overall quality score. None of the encyclopedias considered in this study was rated highly by the academics in terms of suitability for citation in academic publications.

We hope that the results of this study will encourage further independent research on the quality of Wikipedia articles. To this end, Epic and Oxford University are releasing the full version of the report of this study under a Creative Commons Attribution-Share Alike license. They have announced the report here and have released an anonymized dataset under a Creative Commons Zero dedication. The team welcomes comments and feedback on the talk page of the project.

We are very encouraged by the results for this small sample of Wikipedia articles in three languages. While pointing the way forward for further research, these results affirm the quality of the collaborative work of our editor community.

Dario Taraborelli, Senior Research Analyst

 

Seven years after Nature, pilot study compares Wikipedia favorably to other encyclopedias in three different languages

Improving the quality of articles has long been one of the main goals of Wikipedia editors. It is also one of the Wikimedia movement’s strategic priorities for 2010-2015, but measuring it objectively remains a challenge. In 2005, a famous report in the journal Nature found that Wikipedia articles on scientific topics contained an average of just four errors per article, against three per article in the online edition of Encyclopaedia Britannica. Britannica disputed the report, but Nature stood by it, and it continues to be cited frequently to this day.

Since then, however, there have been relatively few independent analyses of the quality of Wikipedia articles, despite the enormous growth of the project. Wikipedia today has more than 23 million articles across all languages (more than four million in English alone), compared to 3.7 million articles in total in 2005. Today it is the sixth most-visited site overall according to Alexa, whereas in 2005 it ranked 37th. How has quality evolved with this increase in reach and size? How does the quality of Wikipedia articles compare today with other online encyclopedias? What methods are appropriate for measuring the quality of an encyclopedic article?

The Wikimedia Foundation is announcing the release of a pilot study conducted by Epic, an e-learning consultancy, in collaboration with Oxford University: “Assessing the Accuracy and Quality of Wikipedia Entries Compared to Popular Online Alternative Encyclopaedias: A Preliminary Comparative Study Across Disciplines in English, Spanish and Arabic.”

The study compares a sample of English Wikipedia articles with their equivalents in Encyclopaedia Britannica, Spanish Wikipedia with Enciclonet, and Arabic Wikipedia with Mawsoah and Arab Encyclopaedia. Twenty-two articles from each of these works were presented to two or three academic experts who are native speakers of these languages, and who assessed them quantitatively and qualitatively.

The small size of the sample prevents us from generalizing the results to Wikipedia as a whole. From a methodological standpoint, however, the study offers new directions for designing a protocol for expert review of encyclopedic content. It also gives our editor community and the Foundation, which commissioned the study in 2011, information to support the design of quality control and measurement mechanisms that can be used on Wikipedia itself.

The results suggest that the sampled Wikipedia articles scored higher overall than their counterparts in the three languages assessed, performing especially well on accuracy and references. As the report notes, the English Wikipedia compares favorably with Encyclopaedia Britannica in terms of accuracy, references and overall judgment, with little difference in score between the two on style and overall quality. The results of the comparison between Spanish Wikipedia and Enciclonet were similar. In Arabic, Mawsoah and Arab Encyclopaedia articles outscored Wikipedia on style, but no significant differences were found on accuracy, references, overall judgment or overall quality. The experts did not consider any of the encyclopedias assessed to be superior to the others in terms of suitability for citation in academic publications.

We hope that the results of the study will encourage further independent research on the quality of Wikipedia articles. To that end, Epic and Oxford University are publishing the full version of the report under a Creative Commons Attribution-Share Alike license. An anonymized dataset generated by the study is also being published under a Creative Commons Zero dedication. The team welcomes comments and feedback on the talk page of the project.

We are very encouraged by the results for this small sample of Wikipedia articles in three languages. While they open a path for future research, these results confirm the quality of the collaborative work of our editor community.

Dario Taraborelli, Senior Research Analyst

 

Seven years after the Nature study, a new study compares Wikipedia’s content to other encyclopedias in three languages

Improving the quality of content is one of the main goals of Wikipedia contributors and one of the goals of the Wikimedia movement’s five-year strategic plan for 2010-2015, but measuring it objectively has been, and remains, a challenge. In 2005, the journal Nature published an article showing that Wikipedia articles contained 4 errors on average, compared to 3 errors in articles of the online Encyclopaedia Britannica. Britannica objected to the report, but Nature stood by it, and the report is still widely circulated today.

Since then, only a few analytical studies of the quality of Wikipedia’s content have appeared, despite the great expansion of the project. Wikipedia today counts more than 23 million articles across multiple languages (more than 4 million of them in English alone), compared to 3.7 million articles in total in 2005. Wikipedia today ranks sixth by number of visits according to Alexa, while it ranked 37th in 2005.

With this increase in size and reach, the question arises of how the quality of content has changed. How does Wikipedia compare today to other encyclopedias available on the Internet in terms of quality? And what are the best methods for measuring the quality of encyclopedic articles?

The Wikimedia Foundation is announcing the release of a preliminary study carried out by the consultancy Epic in partnership with Oxford University, titled “Assessing the Accuracy and Quality of Wikipedia Entries Compared to Popular Online Alternative Encyclopaedias: A Preliminary Comparative Study Across Disciplines in English, Spanish and Arabic.”

The study compared samples from the English Wikipedia to corresponding articles in Encyclopaedia Britannica, the Spanish Wikipedia to corresponding articles in Enciclonet, and the Arabic Wikipedia to Mawsoah and Arab Encyclopaedia. A sample of 22 articles was assessed by 2 to 3 native-speaking academics in each of the three languages, both quantitatively and qualitatively.

The small size of the sample does not allow the results to be generalized to Wikipedia as a whole. However, the preliminary study focused primarily on methodology, and it proposes a new design for expert assessment of encyclopedic content. The study also offers the Wikipedia community and the Wikimedia Foundation, which funded the study in 2011, evidence to help design quality assessment mechanisms and set quality standards for use on Wikipedia itself.

The study concludes that Wikipedia articles scored higher overall in each of the three languages, and did particularly well in the categories of accuracy and references. As the report notes, the English Wikipedia scored well against Encyclopaedia Britannica in terms of accuracy, references and overall judgment, with small differences between the two on style and overall quality score. Similar results were reached when comparing the Spanish Wikipedia to Enciclonet. In Arabic, Mawsoah and Arab Encyclopaedia scored higher than Wikipedia on style, but there were no differences on accuracy, references or overall quality assessment. None of the encyclopedias in this study received a high score for suitability as a reference in academic research.

We hope that the results of this study will encourage further independent research on assessing the quality of Wikipedia articles. Epic and Oxford University are publishing the full version of the report under a Creative Commons license. An anonymized dataset generated by the study has also been published under a Creative Commons Zero dedication. The study team welcomes comments and feedback on the talk page of the project.

We are greatly encouraged by these results for Wikipedia articles in three languages. In addition to paving the way for further research, these results affirm the quality of the collaborative work of the Wikipedia editor community.

Dario Taraborelli, Senior Research Analyst

What MoodBar tells us about new registered editors

One of the three types of mood that MoodBar users can share.

MoodBar is an experimental tool designed to collect feedback from newly registered users about their first experience editing Wikipedia. Its main purpose is to take the pulse of new editors in real time and address issues they encounter when they first hit the edit button. However, MoodBar also gives us a unique opportunity to measure the activity of these new contributors. While we know that levels of early activity vary widely, our results indicate that contributors who share their mood about their first editing experience represent a particularly active and prolific group of new editors:

  • MoodBar users are twice as productive as active users who never sent any feedback.
  • Receiving a “helpful” response to MoodBar feedback is associated with an edit count four times higher than that of an average active user who never sent any feedback.

In this short series of posts we present results from studies on the engagement of newly registered users using MoodBar.

MoodBar usage and early editor activity

Some time has passed since the introduction of MoodBar. The last time we wrote about it was to introduce the Feedback dashboard, a tool that allows community members to respond to issues reported by MoodBar users. In December 2011, we added another small feature, the Mark as “Helpful” extension, which allows MoodBar users to rate the helpfulness of the response they received.

Since last December, 16,810 feedback messages have been posted using MoodBar by 14,568 new registered users. Of these, 5,767 have received at least one response so far, 961 of which have been rated as “Helpful” by the original posters using the Mark-as-Helpful button. Of all the new editor accounts created since August 2011, when we deployed the first version of MoodBar, roughly 2% have posted feedback using this tool.
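The counts above translate into a few simple rates. The short sketch below works them out; it assumes that "of these" refers to feedback messages rather than to users, and the variable names are ours, chosen for illustration only.

```python
# Counts quoted in the paragraph above (December 2011 onward).
feedback_messages = 16_810   # messages posted through MoodBar
feedback_users = 14_568      # distinct new registered users who posted them
responded = 5_767            # messages with at least one response (assumed unit: messages)
marked_helpful = 961         # responses rated "Helpful" by the original poster

response_rate = responded / feedback_messages
helpful_rate = marked_helpful / responded
messages_per_user = feedback_messages / feedback_users

print(f"Share of feedback that received a response: {response_rate:.1%}")  # 34.3%
print(f"Share of responses marked as helpful: {helpful_rate:.1%}")         # 16.7%
print(f"Average messages per MoodBar user: {messages_per_user:.2f}")       # 1.15
```

In other words, roughly one in three feedback messages had been answered at the time of writing, and about one in six of those answers had been marked as helpful.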

These numbers allowed us to study whether MoodBar – a feature that is activated when a new user clicks on the edit button for the first time – has any positive impact on new editor engagement. In this post we will tackle the following research question: what does MoodBar usage tell us about the behavior of new editors and, in particular, is there any difference in early activity between those users who share their mood and those who don’t?

What we learned

Figure 1. Average edit count at 30 days of activity for MoodBar users by treatment (green bars) compared to active users who never sent feedback (orange bar).

To answer the above question, we analyzed the edit count of new registered users during their first 30 days of activity. Our sample includes both users who sent feedback using MoodBar and users who did not, despite it being activated. We call the latter group of active users (i.e. new users who clicked the edit button at least once) our Reference group. Among those who did post feedback, we further distinguish three categories:

  • Feedback: new users who posted feedback but did not receive a response from the Feedback Dashboard response team;
  • Feedback+Response: new users who did receive a response but did not mark it as helpful (whether they really didn’t find it helpful or simply never saw it);
  • Feedback+Helpful: new users who did receive a response and marked it as helpful.
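The grouping described above can be sketched in a few lines of code. This is only an illustration of the four categories and of a per-group average, not the analysis code used for the study; the `NewUser` record, its field names and the sample edit counts are all made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class NewUser:
    """Hypothetical record for an active new user (clicked edit at least once)."""
    edit_count_30d: int    # edits made during the first 30 days of activity
    sent_feedback: bool    # posted at least one MoodBar message
    got_response: bool     # received a response via the Feedback Dashboard
    marked_helpful: bool   # marked a response as helpful


def classify(user: NewUser) -> str:
    """Assign a user to one of the four groups defined in the post."""
    if not user.sent_feedback:
        return "Reference"
    if not user.got_response:
        return "Feedback"
    if not user.marked_helpful:
        return "Feedback+Response"
    return "Feedback+Helpful"


def mean_edits_by_group(users: list[NewUser]) -> dict[str, float]:
    """Average 30-day edit count per group."""
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])  # group -> [sum, count]
    for user in users:
        entry = totals[classify(user)]
        entry[0] += user.edit_count_30d
        entry[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Toy data, invented purely to exercise the functions.
sample = [
    NewUser(2, False, False, False),
    NewUser(4, False, False, False),
    NewUser(6, True, False, False),
    NewUser(5, True, True, False),
    NewUser(12, True, True, True),
]
means = mean_edits_by_group(sample)
```

The order of the checks in `classify` mirrors the definitions: a user only counts as Feedback+Helpful after having both posted feedback and received a response.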

The results are surprising. Figure 1 plots the average edit count 30 days after the activation of MoodBar. We can see that MoodBar users (green bars) do indeed contribute to Wikipedia more than the reference group of non-MoodBar users (orange bar). However, those who have received a response but have not marked it as “helpful” (Feedback+Response) seem to contribute slightly less than those who simply posted feedback but never received a response. This seems counter-intuitive: even assuming that a large fraction of users in the former group was not aware of the possibility of receiving a reply, and hence did not know that somebody had responded to their call for help, how can it be that receiving a response is associated with lower productivity than not receiving a response at all?

To shed some light on these preliminary findings we ran a more controlled series of tests.

We performed a regression analysis of the edit count of users in our sample over their first 30 days of activity to control for possible external factors that we were not taking into account. The main results of the analysis are the following:

  • MoodBar users who received a response not marked as helpful (Feedback+Response) are indeed slightly less productive (-6%) than those who only posted feedback but did not receive a response (Feedback).
  • MoodBar users who received a helpful response (Feedback+Helpful) are 41% more productive than MoodBar users (Feedback), and 400% more productive than non-MoodBar users (Reference).
  • Different types of reported mood are associated with different levels of productivity, but in a very limited way:
    • For all groups, reporting a “confused” mood is associated with a 4% decrease in productivity compared to reporting a “happy” mood.
    • For users in the Feedback+Helpful group reporting a “sad” mood is associated with a 16% lower edit count than reporting any other mood.


Identifying natural-born Wikipedians

These results are very encouraging for both the MoodBar team and the community members who use the Feedback Dashboard and volunteer in the Response Team – to whom we are grateful for their priceless work at welcoming and responding to new registered users. However, it is natural to ask what is the cause of the effects we observed. Are natural-born Wikipedians particularly good at posting feedback and receiving helpful responses, or are good feedback and helpful responses what really makes for a good Wikipedian? Or in other words: is MoodBar increasing the productivity of new editors or just helping identify the most productive ones?

We are currently running a controlled experiment to answer this question. We collected data on a sample of newly registered editors who did not have MoodBar enabled by default on their accounts, and we intend to compare them with a group of users who had MoodBar enabled as usual. At the same time, we issued a call to action to the response team in order to clear the backlog of unanswered feedback. We hope that this will give us enough data to be able to perform a statistically significant comparison and understand whether MoodBar has a genuine effect on new users or is simply a good detector of natural-born Wikipedians.

Read more about the controlled experiment.


Get in touch

You can follow this feed for further updates on MoodBar research or send us your feedback on this study.

Dario Taraborelli, Sr Research Analyst, Strategy
Giovanni Luca Ciampaglia, Research Analyst (contractor)

Converting readers into editors: New results from Article Feedback v5

An invitation to “edit this page” is shown after users post feedback on Wikipedia (‘Call to Action 1’)

Since December 2011, the Wikimedia Foundation has been testing a new version of the Article Feedback Tool, a feature first introduced on the English Wikipedia in 2010. The goal of version 5 (AFTv5) is to engage Wikipedia readers to become more active contributors, by inviting them to provide feedback on articles they read, and encouraging them to become editors over time. 

Early tests of AFTv5 helped us answer the question of what design of the tool produces a desirable balance between volume and usefulness of the feedback collected. In this post we report results from two additional experiments designed to answer the following questions:

  1. Does a prominent invitation to use the tool affect the usefulness of submitted feedback?
  2. How does an invitation to leave feedback affect the conversion of readers into editors?

Our findings suggest that a prominent invitation to post feedback converts a significant number of readers into editors. These new editors appear less productive than other first-time Wikipedians, but their feedback appears just as useful, as detailed below. These findings suggest that article feedback can increase the number of new editors on Wikipedia and can also help existing editors improve the encyclopedia based on reader feedback.

Prominence of Feedback Invitation

(more…)

Wikimedia Foundation endorses mandates for free access to publicly funded research

A photo from a 1973 London School of Economics appeal for funds for its library.

Scholarly information is often too expensive to access. Academic publishers sell journal subscriptions for thousands of dollars per journal per year. Typically, only universities and large libraries, not individuals, are able to pay those fees, which limits access to researchers and others affiliated with institutions with money.

Are these costs justifiable when the underlying research is publicly funded and the underlying goal is public knowledge? If you’re a taxpayer you’ve already paid to fund the research, so why should you pay essentially another tax to read the findings of that research?

On May 20, a team of longtime advocates for public access to scholarly information launched a campaign to urge U.S. President Barack Obama to “require free access over the Internet to journal articles arising from taxpayer-funded research.” Opening up publicly-funded research will “provide access to patients and caregivers, students and their teachers, researchers, entrepreneurs, and other taxpayers who paid for the research.”

This is consistent with Wikimedia’s non-profit mission “to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally.”

A video by the Scholarly Publishing and Academic Resources Coalition about the petition

Wikimedia project volunteers, who are among the taxpayers, should not be denied free access to this information. They should be empowered to read it, report on it, and cite it. Wikipedia and its sister projects depend on the energy and unselfish dedication of this team of contributors – volunteers, researchers, and amateurs – who read and investigate sources as they work to compile accurate, up-to-date, verifiable knowledge. Each month, hundreds of millions of readers around the world view, and have the opportunity to evaluate and contribute to, Wikimedia content. Many do not have the means (nor should they be required) to pay for knowledge, including useful economic, health and scientific information, when their taxes fund the research.

We believe in open access and free licensing as fundamental forces to disseminate knowledge, support education and accelerate discovery.

Today, the Wikimedia Foundation is endorsing this petition, joining thousands of individuals and organizations expressing support for free access to taxpayer-funded research articles. We hope you will join us, too—anyone over age 13 can sign (and you do not need to be a US citizen).

Please consider signing this petition to mandate that all research funded by U.S. taxpayers be made freely available to the citizens of the Web.

Dario Taraborelli, Senior Research Analyst, Wikimedia Foundation
Geoff Brigham, General Counsel, Wikimedia Foundation
Kat Walsh, Member of the Wikimedia Board of Trustees

Helping readers improve Wikipedia: First results from Article Feedback v5

Figure 1. One of the feedback forms tested in the AFTv5 experiments (Option 1).

 

The Wikimedia Foundation, in collaboration with editors of the English Wikipedia, is developing a tool to enable readers to contribute productively to building the encyclopedia. To that end, we started development of a new version of the Article Feedback Tool (known as AFTv5) in October 2011. The original version of the tool, which allows readers to rate articles based on a star system, launched in 2010. The new version invites readers to write comments that might help editors improve Wikipedia articles. We hope that this tool will contribute to the Wikimedia movement’s strategic goals of increasing participation and improving quality.

Testing new feedback forms

On December 22, 2011, we started testing three different designs for the AFTv5 feedback forms:

  • Option 1: Did you find what you were looking for? (shown above)
  • Option 2: Make a suggestion, give praise, report a problem or ask a question
  • Option 3: Rate this article

The purpose of this first experiment was to measure the type, usefulness and volume of feedback posted with these feedback forms. For example, does asking a reader to describe what they were looking for (option 1) provide more actionable feedback than asking them to make a suggestion (option 2)?

We enabled AFTv5 on a small, randomly selected set (0.6%) of articles on the English Wikipedia, as well as a second set of high-traffic or semi-protected articles. A feedback form, randomly selected from the above three options, was placed at the bottom of each page. The feedback form was also accessible via a link docked on the bottom right corner of the page. The resulting comments were then analyzed along a number of dimensions.
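One common way to set up this kind of experiment is to derive both the sample membership and the form variant from a hash of a stable identifier, so that the same page always lands in the same bucket. The sketch below illustrates that idea only; it is not the AFTv5 implementation, and the page-ID key, the function names and the choice to assign the form per page are our assumptions.

```python
import hashlib

FORMS = ("option_1", "option_2", "option_3")  # the three designs listed above
SAMPLE_RATE = 0.006                           # 0.6% of articles


def _hash64(key: str) -> int:
    """Map a string key to a stable 64-bit integer."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def in_sample(page_id: int) -> bool:
    """True for roughly SAMPLE_RATE of all page IDs, always the same ones."""
    return _hash64(f"sample:{page_id}") / 2**64 < SAMPLE_RATE


def assign_form(page_id: int) -> str:
    """Pick one of the three feedback forms, roughly uniformly, for a page."""
    return FORMS[_hash64(f"form:{page_id}") % len(FORMS)]


# Example with made-up page IDs.
sampled = [pid for pid in range(200_000) if in_sample(pid)]
variants = {pid: assign_form(pid) for pid in sampled}
```

Using different key prefixes for the two decisions keeps them independent, so being in the sample says nothing about which form a page gets.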

(more…)

Wikimedia Research Newsletter, February 2012

Vol: 2 • Issue: 2 • February 2012

Gender gap and conflict aversion; collaboration on breaking news; effects of leadership on participation; legacy of Public Policy Initiative

With contributions by: Tbayer, Piotrus, Jodi.a.schneider, Hfordsa and DarTar

Wikipedia research at CSCW 2012

The 15th annual ACM Conference on Computer Supported Cooperative Work (CSCW 2012) featured two sessions about Wikipedia studies. The first one was titled “Scaling our Everest” (in amusing contrast to an earlier metaphor for the role of Wikipedia in that field of research: “the fruit fly of social software”), and covered four papers. A second session likewise comprised four papers and notes. Below are some of the highlights from these two sessions.

Gender gap connected to conflict aversion and lower confidence among women

The Gender Gap hub on Meta.

Since January 2011, Wikipedia’s “Gender gap” has received much attention from Wikimedians, researchers and the media – triggered by a New York Times article that cited the estimate that only 12.64% of Wikipedia contributors are female. That figure came from the 2010 UNU-MERIT study, which was based on the first global, general survey of Wikipedia users, conducted in 2008 with 176,192 respondents. The survey’s methodology had raised some questions (e.g. about sample bias and selection bias), but other studies found similarly low ratios. A new paper titled “Conflict, Confidence, or Criticism: An Empirical Examination of the Gender Gap in Wikipedia”[1] has now delved further into the data of the UNU-MERIT study, examining the responses to questions such as “Why don’t you contribute to Wikipedia?” and “Why did you stop contributing to Wikipedia?”, finding strong support for the following three hypotheses:

(more…)

An experiment on decision making open to Wikipedians

A team of researchers at the Berkman Center for Internet & Society at Harvard University and Sciences Po Paris, led by Professor Yann Algan and Wikimania 2011 keynote speaker Professor Yochai Benkler, invites English Wikipedia contributors to participate in an interactive online experiment on decision making. The goal of this study is to better understand the dynamics of interactions and behavior in online social spaces. The project has been reviewed by the Wikimedia Research Committee and extensively discussed on the English Wikipedia Administrators’ noticeboard. The Wikimedia Foundation is happy to support this project which we believe will help advance research on our community.

Starting today, logged-in eligible editors will receive an invitation to participate in this study via a CentralNotice banner. To reduce banner overload, the invitation banner has been coded so as to be displayed only to a sample of English Wikipedia contributors meeting eligibility conditions for this study. If you are among the editors selected for this study, please consider participating. If you disable the banner, it will not be displayed to you anymore.

The survey takes about 25 minutes to complete. It combines a series of interactive experiments on decision making with questions about attitudes and practices. Based on their decisions and those of other participants in the study, participants will earn money that they can then choose to donate to the International Committee of the Red Cross or the Wikimedia Foundation if they so wish.

The data collected by this study is subject to European privacy protection protocols and will be used for research purposes only. All research outputs and data, while preserving full anonymity of participants, will be made available under an open access license. The research team will present its findings at a Wikimania conference.

Dario Taraborelli, Senior Research Analyst, Strategy

Note: in order not to influence other participants and to ensure the validity of the study, the research team asks participants not to post any comments on this page that discuss the actual content of the survey. The team is happy to receive your comments and answer any questions you may have at berkman_harvard@sciences-po.fr.

Joining forces with open science

The Open Access logo

The open science movement is fighting to make scientific research – especially publicly funded research – more transparent, freely accessible and reusable. The goals of open science are closely aligned with our mission, yet for years there has been little institutional contact between our movement and initiatives such as Open Access and Open Data. Joining forces with individuals and organizations who are working to promote a culture of openness in the scientific community should be high on our agenda.

How can we achieve this goal? The Wikimedia Foundation is currently working on a set of policies to enforce the release of its research data and research output in the open and to incentivize researchers who seek our support or collaboration to do the same. More importantly, today we are thrilled to announce that our community is in a stronger position to bridge the gap with the open science movement. Daniel Mietchen – a biophysicist based in Germany, outspoken open data and open access advocate, and active member of the Wikimedia Research Committee – is the recipient of a grant from the Open Society Foundations and will become the first Wikimedian in Residence on Open Science with a focus on Open Access (OA).

The WiR program has been an immense success in the context of other initiatives such as GLAM. But what exactly is the mission of a Wikimedian in Residence on Open Science? In Daniel’s words, “a Wikimedian in Residence is someone trusted by and in good contact with both the Wikimedia and the partner communities who can guide article development on the target topics and help to keep in focus the common goals, in our case: improving Open Access coverage and reuse in WMF projects”.

As Daniel reports in his programmatic blog post, content from Open Access publishers is already widely used on Wikimedia projects, yet traditional publishers still receive way more citations from Wikipedia articles than their open counterparts. There are lots of one-time image and media donations to Wikimedia but ongoing donations from reusably licensed OA sources have not received adequate attention yet. Likewise, contents from suitably licensed text sources are systematically being used in WMF projects, but OA sources much less so.

Reconstruction of Anatosuchus minor. A CC-BY licensed image from an Open Access article, uploaded to Wikimedia Commons

Daniel’s mission is to facilitate the reuse of materials from Open Access articles in WMF projects, to improve coverage of topics related to Open Access in the English Wikipedia, to support the implementation of the WMF’s Open Access policy and to explore the potential for the WMF community to collaborate with Open Access, Open Science and Open Knowledge initiatives in general. In the long run, the project is designed to extend beyond Open Access and into Open Science proper, as well as into other languages and possibly other collaborative projects. The directions this project ultimately aims to explore, and how to go about the exploration, will be determined in part on the basis of community feedback received during the pilot phase. The host of the project is the Open Knowledge Foundation Germany, which will also act as a content partner and as a contact point for external expertise on matters of Open Knowledge, especially Open Data.

How can you help support this initiative?

You can follow the development of the OA movement via the OA Tracking Project, and Daniel’s work via his dedicated blog, the WiR-OS page on Meta, and Twitter: @EvoMRI.

Daniel Mietchen, Wikimedian in Residence on Open Science
Dario Taraborelli, Senior Research Analyst

Introducing the Wikimedia Research Index

Wikimedia is in the exceptional position of having a thriving community of researchers who have been studying every possible aspect of its projects for nearly a decade. Wikipedia as a topic for scholarly research, in particular, has seen dramatic growth over the last few years, partly thanks to the effort of venues and communities such as WikiSym. Manually curated lists of scholarly studies on Wikipedia show a steady growth in attention in the academic community but probably underestimate the actual volume of scholarly publications on Wikipedia that are published every year (a search of the ACM digital library indicates that 82 papers were published in conference proceedings in 2010 with Wikipedia as a keyword in the title).

Despite this growth, resources for researchers and information about research on Wikimedia projects have been incomplete, unmaintained and scattered. Support for researchers from the Foundation has been ad hoc, and for a long time there has been no team in charge of reviewing external support requests or facilitating collaboration with external researchers.

To address these problems, the Research Committee recently started to rebuild the research documentation available on Meta. Today we are proud to announce the first version of the Wikimedia Research Index, the single go-to point for all research-related needs at Wikimedia.
Wikimedia Research Index screenshot

The main purpose of a research index is to centralize documentation on research of Wikimedia projects, but also to create a place for the community to discuss and learn about this research. The Wikimedia Research Index will:

  • provide documentation on resources for Wikimedia researchers, including datasets, tools and code libraries, conferences and events
  • act as a point of contact for researchers with each other and the Foundation (by complementing wiki-research-l)
  • formalize support for research projects and specify what the Foundation expects from the projects it supports
  • host research policies and guidelines
  • track research projects (both initiated by the Foundation and by the research community) that study Wikimedia contents and communities or that build innovative results and applications on top of Wikimedia data

These are some highlights from the Research Index:

  • we have been working on a set of policies to ensure that research supported in different forms by the Foundation is released in the open (with respect not only to its output, but also to code and data). The new open access/open data policy of the Foundation will be announced in a separate post.
  • as part of this work, we will soon be announcing the first in a series of monthly research newsletters covering the most recent updates in Wikimedia research, modeled around the Signpost.
  • we will be highlighting via the Research Index, the Foundation’s blog and the research newsletter a series of featured projects that touch on issues of particular strategic importance. The first featured project is the Wikimedia Summer of Research, hosted by the WMF Community department.
  • we created a dedicated IRC channel on Freenode as a friendly place to discuss in real time issues of relevance to Wikimedia research.

The Research Index is, by definition, a constant work in progress and there are several ways in which you can help us improve it: as a researcher, by making sure that your past and current projects are documented in the research project directory and by bringing to our attention any results, calls for papers and research-related initiatives we should be aware of (particularly if you wish to have them included in the research newsletter); as a community member, by participating in project-specific discussions, by highlighting issues that are particularly sensitive from a community perspective and by suggesting topics and issues in search of an answer from the research community.

We hope with this initiative to increase the volume, speed, impact and potential audience of research that helps improve our understanding of Wikimedia projects and communities.

Dario Taraborelli, Senior Research Analyst, Strategy

WikiViz 2011: Visualizing the impact of Wikipedia

To celebrate the 10-year anniversary of Wikipedia, and its impressive growth in content, quality, diversity, and readership, the International Symposium on Wikis and Open Collaboration (WikiSym) and the Wikimedia Foundation (WMF) are jointly launching WikiViz 2011 – a call for data/information visualization experts, computational journalists, data artists and data scientists to create the most insightful visualization of Wikipedia’s impact.

WikiViz 2011 is about visualizing the impact of Wikipedia using open data. We want to see the most effective, compelling and creative data-driven visualizations of how Wikipedia impacted the world with its content, culture and open collaboration model. Potential topics include: the imprint of Wikipedia on knowledge sharing and access to information; its impact on literacy and education, journalism and research; on the functioning of scientific and cultural organizations and businesses, as well as the daily life of individuals around the world. In addition, we want to see visualizations of areas of knowledge, geographical regions, organizations and people Wikipedia has not been able to reach or has impacted less than one would have expected. In summary, the main goal of this competition is to improve our understanding of how Wikipedia is affecting the world beyond the scope of its own community.

Awards

The WikiViz 2011 Awards Ceremony will take place on October 4 at the WikiSym 2011 main venue, the Microsoft Research Silicon Valley campus (Mountain View, California). The ceremony will open with a keynote by Jeff Heer (Stanford University) on the impact of emerging visualization techniques on understanding open collaboration today.

Three finalist teams (1 winner, 2 runners-up) will be invited to present their work at WikiSym 2011, in Mountain View (California). Travel expenses and registration fees will be covered for one delegate per finalist team. The submissions from these three teams will be showcased at the WikiSym 2011 exhibit, presented during the WikiViz awards ceremony and featured by our Knowledge and Media Partners (Unidad Editorial, Periscopic, Information Aesthetics, Visualizing.org and Flowing Data).

Furthermore, Spanish media group Unidad Editorial will run a voting process in September among the visitors of El Mundo.es (the largest digital newspaper in Spanish by readership worldwide) to select the “Public’s choice” visualization among the top 10 submissions received. The winner will be featured in the digital edition of El Mundo.

Jury

The finalists will be selected by a jury composed of world-class experts in data visualization and social computing:

How to participate

Please refer to the WikiViz call for participation for details on the terms and conditions of participation, submission instructions, selection rules and evaluation criteria. Only entries based on open data and licensed under a Wikimedia Commons-compatible open license will be considered.

Important dates

  • June 29, 2011: Challenge call for submissions.
  • August 28, 2011: Submission deadline (extended).
  • September 12, 2011: Winner and finalist submissions announced.
  • October 4, 2011: WikiViz awards session, WikiSym 2011 (Mountain View, CA).

Contact

For any questions, comments or interest in supporting or collaborating with this challenge, please contact the co-organizers at: wikiviz2011@easychair.org

You can also follow us on Twitter: @WikiViz (tag your tweets with #wikiviz11).

More

WikiViz 2011 is the second of two data challenges the Wikimedia Foundation is organizing this summer. If you are interested in building predictive models of Wikipedia editor activity, check out the Wikipedia participation challenge.

Organizers

WikiSym Wikimedia Foundation

Media Sponsors

El Mundo.es

Knowledge Partners

infosthetics FlowingData.com
visualizing.org Periscopic