Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Wikimedia Research Newsletter

Wikimedia Research Newsletter, August 2012

Vol: 2 • Issue: 8 • August 2012

New influence graph visualizations; NPOV and history; ‘low-hanging fruit’

With contributions by: Piotrus, Ragesoss, Evan, DarTar, Tbayer and OrenBochman

Wikipedia-based graphs visualize influences between thinkers, writers and musicians

A visualization of musical genres related to psychedelic music, based on DBpedia data.

In a blog post titled “Graphing the history of philosophy”,[1] Simon Raper of the company MindShare UK describes how he constructed an influence graph of all philosophers using the “Influenced by” and “Influenced” fields of Template:Infobox philosopher (example: Plato). This information was retrieved from DBpedia with a simple SPARQL query. After some cleanup, the result, consisting of triplets in the form <Philosopher A, Philosopher B, Weight>, was processed using the open-source graph visualization package Gephi to create an impressive overview of the philosophers within their respective spheres of influence.
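The cleanup step that turns raw influence pairs into weighted triplets can be sketched roughly as follows. This is a minimal illustration, not Raper's actual code: the sample pairs are placeholders, the weight is assumed to be the number of times a pair occurs (for instance once via an “Influenced” field and once via the matching “Influenced by” field), and the `Source`/`Target`/`Weight` column names are an assumption about what an edge-list import expects.

```python
import csv
import io
from collections import Counter


def build_weighted_edges(pairs):
    """Collapse (influencer, influenced) pairs into (source, target, weight) triplets.

    The weight is simply the number of times a pair was seen (an assumption).
    Pairs with an empty name on either side are dropped as part of the cleanup.
    """
    counts = Counter(
        (src.strip(), dst.strip())
        for src, dst in pairs
        if src.strip() and dst.strip()
    )
    return sorted((src, dst, weight) for (src, dst), weight in counts.items())


def edges_to_csv(edges):
    """Serialize triplets as a CSV edge list with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])  # assumed column names
    writer.writerows(edges)
    return buf.getvalue()


# Placeholder rows standing in for the result of a DBpedia query.
SAMPLE_PAIRS = [
    ("Socrates", "Plato"),
    ("Plato", "Aristotle"),
    ("Socrates", "Plato"),   # duplicate from the reverse infobox field
    ("Plato ", "Aristotle"),  # stray whitespace, normalized by strip()
    ("", "Aristotle"),       # incomplete row, dropped
]

edges = build_weighted_edges(SAMPLE_PAIRS)
print(edges_to_csv(edges))
```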

Brendan Griffen extended the idea to “everyone on Wikipedia. Well, everyone with an infobox containing ‘influences’ and/or ‘influenced by’”, arriving at a huge, far denser “Graph Of Ideas” including not only philosophers, but also novelists, fantasy and science fiction writers, and comedians.[2] In another blog post,[3] Griffen added transitive links as well, so that each person is considered to be influenced both directly and indirectly. The most connected people in the graph were ancient Greek thinkers, with Thales, Pythagoras and Zeno of Elea occupying the top three spots. Griffen remarks that this vindicates a statement in Bertrand Russell’s History of Western Philosophy (1945): “Western Philosophy begins with Thales”.
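Counting direct and indirect influence amounts to computing, for each person, the set of people reachable along influence links. The sketch below shows one way to do this with a breadth-first search; it is not Griffen's code, and the node names are abstract placeholders rather than real influence data.

```python
from collections import defaultdict, deque


def transitive_influence(edges):
    """Map each node to the set of nodes it influences directly or indirectly.

    `edges` is an iterable of (influencer, influenced) pairs. Cycles are
    tolerated: a node is never counted as influencing itself.
    """
    graph = defaultdict(set)
    nodes = set()
    for src, dst in edges:
        graph[src].add(dst)
        nodes.update((src, dst))

    reach = {}
    for start in nodes:
        seen = set()
        queue = deque(graph[start])
        while queue:
            node = queue.popleft()
            if node == start or node in seen:
                continue
            seen.add(node)
            queue.extend(graph[node])
        reach[start] = seen
    return reach


def rank_by_influence(edges):
    """Return (node, reach size) pairs, most influential first, ties by name."""
    reach = transitive_influence(edges)
    return sorted(((n, len(r)) for n, r in reach.items()), key=lambda t: (-t[1], t[0]))


# Toy graph with a cycle B -> C -> D -> B; names are placeholders.
TOY_EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "C"), ("D", "B")]
ranking = rank_by_influence(TOY_EDGES)
print(ranking)
```

Running one search per node is adequate for a sketch; a graph covering every infobox on Wikipedia would likely call for a dedicated graph library instead.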

(more…)

Wikimedia Research Newsletter, July 2012

Vol: 2 • Issue: 7 • July 2012

Conflict dynamics, collaboration and emotions; digitization vs. copyright; WikiProject field notes; quality of medical articles; role of readers; Best Wiki Paper Award

With contributions by: Daniel Mietchen, Junkie.dolphin, Jodi.a.schneider, Adler.fa, OrenBochman, DarTar, Benjamin Mako Hill and Tbayer

Modeling social dynamics in a collaborative environment

A draft of a letter, submitted for publication, has been posted on arXiv.[1] The letter reports research on modeling the process of collaborative editing in Wikipedia and similar open-collaboration writing projects. The work builds on previous research by some of its authors on conflict detection in Wikipedia. The authors explore a simple agent-based model of opinion dynamics, in which editors influence each other either by direct communication or by successively editing a shared medium, such as a Wikipedia page. According to the authors, the model, although highly idealized, exhibits rich behavior that can reproduce, albeit only qualitatively, some key characteristics of conflicts over real-world Wikipedia pages. The authors show that, for a fixed editorial pool with one “mainstream” and two opposing “extremist” groups, consensus is always reached. However, depending on the values of the model’s input parameters, achieving consensus may take an extremely long time, and the consensus does not always conform to the initial mainstream view. In the case of a dynamic group, where new editors replace existing ones, consensus may be achieved through a phase of conflict, depending on the rate of new editors joining the editorial pool and on the degree of controversy over the article’s topic.
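To make the two influence channels concrete, here is a minimal sketch of an agent-based model in this spirit. It is not the authors' model: the bounded-confidence update rules, the parameter names (`eps_talk`, `eps_article`, `mu`) and their values, and the group sizes are all assumptions chosen for illustration.

```python
import random


def simulate(opinions, article, steps, eps_talk=0.2, eps_article=0.1, mu=0.3, seed=0):
    """Run a toy opinion-dynamics simulation with a shared article.

    opinions: initial editor opinions in [0, 1]
    article:  initial position of the article in [0, 1]
    Each step has two parts (both are assumed rules, not taken from the letter):
      1. Direct communication: two random editors move toward each other
         if their opinions differ by less than eps_talk.
      2. Shared medium: a random editor who disagrees with the article by more
         than eps_article edits it toward their own view; otherwise the editor
         drifts toward the article.
    """
    rng = random.Random(seed)
    x = list(opinions)
    a = article
    for _ in range(steps):
        i, j = rng.sample(range(len(x)), 2)
        if abs(x[i] - x[j]) < eps_talk:
            shift = mu * (x[j] - x[i])
            x[i] += shift
            x[j] -= shift

        k = rng.randrange(len(x))
        if abs(x[k] - a) > eps_article:
            a += mu * (x[k] - a)     # editor rewrites the article
        else:
            x[k] += mu * (a - x[k])  # editor is persuaded by the article
    return x, a


# One "mainstream" and two opposing "extremist" groups (sizes are assumptions).
initial = [0.05] * 5 + [0.5] * 20 + [0.95] * 5
final_opinions, final_article = simulate(initial, article=0.5, steps=2000, seed=1)
print(f"opinion spread: {max(final_opinions) - min(final_opinions):.3f}")
print(f"article position: {final_article:.3f}")
```

Varying `eps_talk`, `eps_article` and `mu` in such a sketch is one way to explore how long agreement takes to emerge and where the article ends up; a dynamic editorial pool could be approximated by occasionally replacing an entry of `x` with a fresh random opinion.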

How Wikipedia articles benefit from the availability of public domain resources

(more…)

Wikimedia Research Newsletter, June 2012

Vol: 2 • Issue: 6 • June 2012

Edit war patterns, deleters vs. the 1%, never used cleanup tags, authorship inequality, higher quality from central users, and mapping the wikimediasphere

With contributions by: Tbayer, Piotrus, Evan and Daniel Mietchen

Dynamics of edit wars

Controversy about Michael Jackson as quantified on the basis of reverted edits to his Wikipedia article. A: Jackson is acquitted on all counts after a five-month trial. B: Jackson makes his first public appearance since the trial to accept eight records from the Guinness World Records in London, including Most Successful Entertainer of All Time. C: Jackson issues Thriller 25. D: Jackson dies in LA.

“Dynamics of Conflicts in Wikipedia”[1] develops an interesting “measure of controversiality”, something that might be of interest to editors at large if it were a more widely popularized and dynamically updated statistic. The authors look at the patterns of edit warring on Wikipedia articles, finding that edit warriors are usually prone to reaching consensus, and that the rare cases of never-ending warring involve articles that continuously attract new editors who have not yet joined the consensus.

Regarding methodology, the authors’ decision to filter out articles with under 100 edits as “evidently conflict-free” is a bit problematic, as articles with fewer than 100 edits have been subject to clear, if not over-long, edit warring (a recent example: Concerns and controversies related to UEFA Euro 2012). One could also wish that the “memory effects” – a term mentioned only in the abstract and lead, which the authors suggest is significant to understanding the conflict dynamics – were explained somewhere in the article (the term “memory” itself appears four times in the body and does not seem to be operationalized anywhere).

A press release accompanying the paper is titled “Wikipedia ‘edit wars’ show dynamics of conflict emergence and resolution”, while an MSNBC tech news headline summarized it as “Wikipedia is editorial warzone, says study”.

Who deletes Wikipedia

(more…)

Wikimedia Research Newsletter, May 2012

Vol: 2 • Issue: 5 • May 2012

Supporting interlanguage collaboration; detecting reverts; Wikipedia’s discourse, semantic and leadership networks, and Google’s Knowledge Graph

With contributions by: Jodi.a.schneider, Piotrus, Tbayer and Angelika Adam

Discourse on Wikipedia sometimes irrational and manipulative, but still emancipating, democratic and productive

An article[1] in the sociology journal The Information Society looks at interactions between Wikipedia editors and the project’s governance, visible in the articles on stem cells and transhumanism, and in the analysis of Wikipedia’s discussion of userboxes, all through the prism of Jürgen Habermas’s universal pragmatics and Mikhail Bakhtin’s dialogism.

The authors focus on the qualitative analysis of language used by editors, to argue that Wikipedia has elements of a democracy, and is an example of a Web 2.0–empowering discourse tool. They stress that some forms of discourse found online (including on Wikipedia) may be highly irrational, a point often ignored by earlier arguments that Web 2.0 is a democratic space, but they argue that this is in fact not as much of a hindrance as previously expected. Cimini and Burr remark that discourse can develop between Wikipedians of widely differing points of view, and that some editors will engage in “repeated, strategic, and often highly manipulative attempts” to assert personal authority. Such discussions may be very lively, involving “personal, emotional, or humour-based arguments”, yet the authors argue that such comments may not be a hindrance; instead, “on many occasions, there is thus a clearer exposition of views that is achieved, in spite of, or perhaps because of, these personal [and] sometimes vulgar methods of argumentation.”

In the end, the authors are positive about the success of Wikipedia’s deliberation in reaching consensus, although they say that it can be “fleeting and transitory” on occasion. Unfortunately, the paper does not touch on Wikipedia policies such as Wikipedia:Civility and Wikipedia:No personal attacks, which would certainly have added to their analysis.

(more…)

Wikimedia Research Newsletter, April 2012

Vol: 2 • Issue: 4 • April 2012

Barnstars work; Wiktionary assessed; cleanup tags counted; finding expert admins; discussion peaks; Wikipedia citations in academic publications; and more

With contributions by: Lambiam, Piotrus, Jodi.a.schneider, Amir E. Aharoni, DarTar, Tbayer, Steven Walling, Junkie.dolphin and Protonk

Recognition may sustain user participation

The relative number of edits by Wikipedians who had randomly received barnstars (red) and by the control group whose members hadn’t (blue).

To gain insight into what makes Wikipedia tick, two researchers from the Sociology Department at Stony Brook University conducted an experiment with barnstars.[1] They were surprised by what they found.

Professor Arnout van de Rijt and graduate student Michael Restivo wanted to test the hypothesis that receiving recognition for one’s work in an informal peer-based environment such as Wikipedia has a positive effect on productivity. To do so, they determined the top 1% most productive English Wikipedia users among the currently active editors who had yet to receive their first barnstar. From that group they took a random sample of 200 users. Then they randomly split the sample into an experimental group and a control group, each consisting of 100 users. They awarded a barnstar to each user in the experimental group; the users in the control group were not given a barnstar. The researchers found their hypothesis confirmed: the productivity of the users in the experimental group was significantly higher than that of the control group. What really took the researchers by surprise was how long-lasting the effect was. They followed the two groups for 90 days, observing that the increase in contribution level for the group of barnstar recipients persisted, almost unabated, for the full observation period.
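The sampling and assignment procedure described above can be sketched in a few lines. This is an illustration only: the function name, the synthetic edit counts and the fixed random seed are assumptions, and the sketch does not model how the researchers identified active editors without a barnstar.

```python
import random


def assign_groups(edit_counts, top_fraction=0.01, sample_size=200, seed=42):
    """Pick the most productive users, sample them, and split the sample in two.

    edit_counts: dict mapping eligible user names to their edit counts
                 (eligibility, e.g. "active and no barnstar yet", is assumed
                 to have been checked beforehand).
    Returns (experimental_group, control_group), each of size sample_size // 2.
    """
    rng = random.Random(seed)
    ranked = sorted(edit_counts, key=lambda user: (-edit_counts[user], user))
    cutoff = max(1, int(len(ranked) * top_fraction))
    top_users = ranked[:cutoff]
    if len(top_users) < sample_size:
        raise ValueError("not enough users in the top fraction to sample from")

    # rng.sample returns the chosen users in random order, so cutting the
    # list in half yields a random split into two groups.
    sample = rng.sample(top_users, sample_size)
    half = sample_size // 2
    return sample[:half], sample[half:]


# Synthetic stand-in data: 50,000 eligible users with made-up edit counts.
users = {f"user{i}": i for i in range(50000)}
experimental, control = assign_groups(users)
print(len(experimental), len(control))
```

Randomizing both the sample and the split is what allows a later difference in productivity between the two groups to be attributed to the barnstar rather than to pre-existing differences.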

(more…)

Wikimedia Research Newsletter, March 2012

Vol: 2 • Issue: 3 • March 2012

Predicting admin elections by editor status and similarity; flagged revision debates in multiple languages; Wikipedia literature reviewed

With contributions by: Tbayer, DarTar, Jodi.a.schneider, Njullien and Piotrus

How editors evaluate each other: effects of status and similarity

A team of social computing researchers based at Stanford and Cornell University studied how users evaluate each other in social media.[1] Their paper, presented at the 5th ACM Web Search and Data Mining Conference (WSDM ’12), focuses on three main case studies: Wikipedia, StackOverflow and Epinions. User-to-user evaluations, the authors note, are jointly influenced by the properties of the evaluator and the target; as a result, differences in properties between the target and the evaluator should be expected to affect the evaluation. The study looks specifically at how differences in topic expertise and status affect peer evaluations. The Wikipedia case focuses on requests for adminship (RfAs), the most prominent example of peer evaluation in Wikipedia and a topic that has attracted considerable attention in the literature (Signpost research coverage: September 2011, October 2011, January 2012). Similarity is measured based on article co-authorship, and status as a function of an editor’s number of contributions. Previous research by the same authors showed that the probability an evaluator will evaluate a target user positively drops dramatically when the status of the two users is very similar, and there is general evidence that homophily and similarity in editing activity have a strong influence on peer evaluation in RfAs. The study identifies two effects that jointly account for this singular finding:

  • “Elite” or high-status users are more likely to participate in evaluations about other users who are active in their areas of interest or expertise.
  • Low-status users tend to be judged differently than those with moderate or high status.

In a direct application of these results, dubbed ballot-blind prediction, the authors show how the outcome of an RfA can be accurately predicted by a model that simply considers the first few participants in a discussion and their attributes, without looking at their actual evaluations of the target.

Sociological analysis of debates about flagged revisions in the English, German and French Wikipedias

At the center of debates on “Coercion or empowerment”: Icons signifying accepted (left) and not yet accepted (right) revisions under a flagged revisions scheme

In an article to appear in Ethics and Information Technology, Paul B. de Laat analysed debates occurring in the English, German and French Wikipedias about the evolution of the rules governing new edits.[2] As noted by the analysis of the English Wikipedia’s rules, by Butler et al., 2008,[3] these rules are numerous and have increased in number and complexity; they range from the more formal and explicit (intellectual property rights) to the more informal.

(more…)

Wikimedia Research Newsletter completes first volume, introduces new features

Download the complete Volume 1 (PDF)

The success of Wikipedia continues to attract an enormous amount of attention from researchers who are trying to understand what made this one of the most remarkable collaboration projects in history, and unearth valuable insights that may help to improve it. The monthly Wikimedia Research Newsletter launched in mid-2011 – shortly after the announcement of the Wikimedia Research Index – with the aim of covering recent academic research about Wikipedia and other Wikimedia projects. Published jointly by the Wikimedia Research Committee and the Signpost (the English Wikipedia’s community-edited newspaper), it has established itself as a comprehensive outlet enabling both researchers and Wikipedians to stay on top of current research, aiming to facilitate exchange between these two communities.

The six issues published in the first volume (July–December 2011) featured 87 unique references (93 citations) and attracted altogether more than 17,000 pageviews in 2011, not counting the WMF blog edition. The complete Volume 1 is now available as a downloadable 45-page PDF, and a print version can be ordered from Pediapress. The full list of publications reviewed or covered in the Newsletter in 2011 can be browsed online or downloaded (as a BibTeX, RIS, PDF file or in other formats), ready to be imported into reference managers or other bodies of wiki research literature. Open access papers in this collection have been marked with a special open_access tag in the reference list and with an OA icon in the body of each issue.

We are also happy to announce the launch of @WikiResearch: a news feed on Twitter and Identi.ca, covering new preprints, papers or research-related blog posts, before they are reviewed more fully in the Newsletter.

What’s more, the Newsletter is now also available in the form of an HTML email newsletter (in addition to the announcements of each new issue on the Wikiresearch-l mailing list, which only contain the table of contents). Sign up here to receive a copy of each new issue in your inbox as soon as it comes out.

The Newsletter is a collaborative effort and would not exist without those who have contributed reviews and summaries so far: Boghog, DarTar, Drdee, Hfordsa, Jodi.a.schneider, Junkie.dolphin, Lilaroja, Mietchen, Phoebe, Piotrus, Romanesco, Steven Walling, Tbayer. We are also grateful for the help of several Signpost collaborators in copyediting and preparing the final publication every month.

Finally, thanks to everyone for reading the Wikimedia Research Newsletter, and please consider contributing by pointing us to new research we should cover, or by volunteering to review new publications.

The editors of the Wikimedia Research Newsletter:

Dario Taraborelli, Senior Research Analyst, Strategy
Tilman Bayer, Movement Communications

Wikimedia Research Newsletter, February 2012

Vol: 2 • Issue: 2 • February 2012

Gender gap and conflict aversion; collaboration on breaking news; effects of leadership on participation; legacy of Public Policy Initiative

With contributions by: Tbayer, Piotrus, Jodi.a.schneider, Hfordsa and DarTar

Wikipedia research at CSCW 2012

The 15th annual ACM conference on computer-supported cooperative work (CSCW 2012) featured two sessions about Wikipedia Studies. The first one was titled “Scaling our Everest” (in amusing contrast to an earlier metaphor for the role of Wikipedia in that field of research: “the fruit fly of social software”), and covered four papers. A second session likewise comprised four papers and notes. Below are some of the highlights from these two sessions.

Gender gap connected to conflict aversion and lower confidence among women

The Gender Gap hub on Meta.

Since January 2011, Wikipedia’s “Gender gap” has received much attention from Wikimedians, researchers and the media – triggered by a New York Times article that cited the estimate that only 12.64% of Wikipedia contributors are female. That figure came from the 2010 UNU-MERIT study, which was based on the first global, general survey of Wikipedia users, conducted in 2008 with 176,192 respondents using a methodology that had raised some questions (e.g. about sample bias and selection bias), but other studies found similarly low ratios. A new paper titled “Conflict, Confidence, or Criticism: An Empirical Examination of the Gender Gap in Wikipedia”[1] has now delved further into the data of the UNU-MERIT study, examining the responses to questions such as “Why don’t you contribute to Wikipedia?” and “Why did you stop contributing to Wikipedia?”, finding strong support for the following three hypotheses:

(more…)

Wikimedia Research Newsletter, January 2012

Vol: 2 • Issue: 1 • January 2012

Language analyses examine power structure and political slant; Wikipedia compared to commercial databases

With contributions by: Tbayer and Piotrus

Admins influence the language of non-admins

An arXiv preprint titled “Echoes of power: Language effects and power differences in social interaction”[1] looks at the language used by Wikipedia editors. The authors look at how conversational language can be used to understand power relationships. The research analyzes how much one adapts their language to the language of others involved in a discussion (the process of language coordination). The findings indicate that the more such adaptation occurs, the more deferential one is. The authors find that editors on Wikipedia tend to coordinate (language-wise) more with administrators than with non-administrators. Furthermore, the study suggests that one’s ability to coordinate language has an impact on one’s chances of becoming an administrator: the admin candidates who do more language coordination have a higher chance of becoming an administrator than those who don’t change their language. Once a person is elected an administrator, they tend to coordinate less.
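One simple way to put a number on such coordination, assumed here for illustration and not taken from the summary above, is to compare how often a reply uses a given class of words when the preceding message used it with how often replies use it overall. The data layout and function name below are hypothetical.

```python
def coordination(exchanges):
    """Estimate how strongly repliers echo a linguistic marker.

    exchanges: list of (message_has_marker, reply_has_marker) boolean pairs,
               one per message/reply exchange.
    Returns P(reply has marker | message has marker) - P(reply has marker).
    A positive value means replies use the marker more often when the
    preceding message used it. This definition is an assumption of the sketch.
    """
    if not exchanges:
        raise ValueError("no exchanges given")
    triggered = [reply for message, reply in exchanges if message]
    if not triggered:
        raise ValueError("the marker never occurs in the original messages")
    p_reply_given_message = sum(triggered) / len(triggered)
    p_reply = sum(reply for _, reply in exchanges) / len(exchanges)
    return p_reply_given_message - p_reply


# Made-up exchanges for a single marker class (e.g. articles or pronouns).
SAMPLE_EXCHANGES = [
    (True, True),
    (True, True),
    (True, False),
    (False, False),
    (False, True),
    (False, False),
]
score = coordination(SAMPLE_EXCHANGES)
print(round(score, 3))
```

Computing such a score separately for replies addressed to administrators and to non-administrators would be one way to compare the two groups.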

A blog post on the website of Technology Review summarized the results using the headline “Algorithm Measures Human Pecking Order” and highlighted the fact that one of the authors is Jon Kleinberg, known as the inventor of the HITS algorithm (also known as “hubs and authorities”).

Can Wikipedia replace commercial biography databases?

California State University, East Bay: Could it rely on biographical information from Wikipedia and the web alone?

An article[2] by a librarian and professor at California State University compares “biographical content for literary authors writing in English” across Wikipedia, “the web” (i.e. top Google search results) and two commercial databases: the Biography Reference Bank (BRB, now part of EBSCO Industries) and Contemporary Authors Online (CAO). The comparison was motivated by the decision of the author’s institution to cancel its subscription to CAO during a budget crisis in 2008–2009, a decision that, among other reasons, had been accompanied by “a comment that this information is ‘on the web’”.

The paper starts out with a literature review on the reliability of Wikipedia and then describes how the author compiled a list of 500 authors (mostly from the US and UK) by “examining curricula and textbooks from English literature courses across the USA” and soliciting additional suggestions from peers. These names were then searched on BRB, CAO (as part of the Literature Resource Center), Wikipedia and Google.

(more…)

Wikimedia Research Newsletter, December 2011

Vol: 1 • Issue: 6 • December 2011

Psychiatrists: Wikipedia better than Britannica; spell-checking Wikipedia; Wikipedians smart but fun; structured biological data

With contributions by: Tbayer, DarTar and Jodi.a.schneider

Mental health information on Wikipedia more accurate than Britannica and Kaplan & Sadock psychiatry textbook

Wikipedia articles on schizophrenia and other mental health topics were assessed for accuracy, richness of references and readability.

In an article for Psychological Medicine,[1] ten researchers from the University of Melbourne conclude that “the quality of information on depression and schizophrenia on Wikipedia is generally as good as, or better than, that provided by centrally controlled websites, Encyclopaedia Britannica and a psychiatry textbook.”

The study focused on ten mental health topics (e.g. “antidepressants and suicide in young people” or “side-effects of antipsychotics”), five each in the areas of depression and schizophrenia. “Using the topic terms (or synonyms) as key words for the searches or through manual browsing, content relating to these topics was extracted from [Wikipedia and 13 other websites selected for prominent Google results for depression and schizophrenia] and from the most recent edition of Kaplan & Sadock’s Comprehensive Textbook of Psychiatry … and the online version of Encyclopaedia Britannica” by two reviewers. For both depression and schizophrenia, three psychologists with clinical and research expertise in that area evaluated these extracts on accuracy, up-to-dateness, breadth of coverage, referencing and readability, on a scale from 1 to 5 (“e.g. Accuracy: 1 = many errors of fact or unsubstantiated opinions, 3=some errors of fact or unsubstantiated opinions, 5 = all information factually accurate”). As in an earlier study of the quality of health information on Wikipedia (Signpost coverage: “Wikipedia’s cancer coverage is reliable and thorough, but not very readable”), readability was also measured using a Flesch–Kincaid readability test, which is calculated from word and sentence lengths.

For both depression and schizophrenia, Wikipedia scored highest in the accuracy, up-to-dateness, and references categories – surpassing all other resources, including WebMD, NIMH, the Mayo Clinic and Britannica online. In breadth of coverage, it was behind Kaplan & Sadock and others for both areas. And “of the online resources, Wikipedia was rated the least readable [by the human reviewers], although some of its topics received an average rating.” Likewise, the Wikipedia content had relatively high Flesch–Kincaid Grade Level indices (around 16 for schizophrenia and 15 for depression – indicating that a tertiary level of education is necessary to understand the content), similar to that of Britannica but higher than most other resources examined.
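For readers unfamiliar with the index, the Flesch–Kincaid Grade Level combines average sentence length (words per sentence) with average word length (syllables per word). The sketch below applies the standard formula; the syllable counter is a crude vowel-group heuristic assumed for illustration, so its results will differ from those of the tool used in the study.

```python
import re


def fk_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level from raw counts (standard formula)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word):
    """Rough syllable estimate: number of vowel groups, at least one.

    This heuristic is an assumption of the sketch and miscounts many words.
    """
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fk_grade_of_text(text):
    """Estimate the grade level of a plain-text passage."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade(len(words), len(sentences), syllables)


sample = "The cat sat. The dog ran."
print(round(fk_grade_of_text(sample), 2))
```

Longer sentences and longer words both push the score up, which is how dense encyclopedic prose can reach values in the mid-teens.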

The authors note that their “findings largely parallel those of other recent studies of the quality of health information on Wikipedia” (citing eight such studies published between 2007 and 2010):

“Despite variability in the methodologies and conclusions of these studies, the overall implication is that Wikipedia articles on health topics typically contain relatively few factual errors, although they may lack breadth of coverage. … Given the number of patients, would-be patients and concerned others using the internet to search for information on health issues, it seems that Wikipedia is an appropriate recommendation as an information source.”

Psychologists gauge impact of Wikipedia’s Rorschach test coverage

(more…)