How many women edit Wikipedia?

Women edit Wikipedia together at an Arts + Feminism Edit-a-thon in Banff, Canada. Photo by ABsCatLib, under CC BY-SA 4.0

The month-long “Inspire” campaign seeking ideas for new initiatives to increase gender diversity on Wikipedia recently concluded successfully, with hundreds of new ideas and over 40 proposals entering consideration for funding.

During this campaign, there were a lot of questions about the empirical basis for the statement that women are underrepresented among Wikipedia editors, and in particular about the estimate given in the campaign’s invitation banners (which stated that less than 20% of contributors are female).

This blog post gives an overview of the existing research on this question, and also includes new results from the most recent general Wikipedia editor survey.

The Wikimedia Foundation conducted four general user surveys that shed light on this issue, in 2008 (in partnership with academic researchers from UNU-MERIT), 2011 (twice) and 2012. These four large surveys, as well as some others mentioned below, share the same basic approach: Wikipedia editors are shown a survey invitation on the site, and volunteer to follow the link to fill out a web-based survey. This has been a successful and widely used method. But there are some general caveats about the data collected through such voluntary web surveys:

  • Percentages cannot be compared, due to different survey populations: The overall percentage among respondents from one survey (e.g. the frequently cited 9% from the December 2011 WMF editor survey, or the 13% from the 2008 WMF/UNU-MERIT survey) is often taken as a rough proxy of “the” gender ratio among Wikipedia contributors overall. But different surveys cover different populations, e.g. because they were not available in the same set of languages, or because the definition of who counts as “editor” varies. This is especially relevant when trying to understand how the gender gap develops over time – e.g. we can’t talk about a “drop” from 13% to 9% between the 2008 and April 2011 surveys, because their populations are not comparable. Also, the slightly higher overall percentage in the 2012 survey, compared to the preceding one (see below) should not be interpreted as a rise. However, comparisons are possible for comparable populations, and in this post we present such trend statements for the first time.
  • Participation bias between languages: There is evidence that participation rates for such surveys vary greatly between editors from different languages. For example, in both the 2008 survey and the 2012 survey, the number of Russian-language participants was disproportionately high relative to the number of active editors in that language.
  • Women editors may be less likely to participate in surveys: A 2013 research paper by Benjamin Mako Hill and Aaron Shaw confirmed the longstanding suspicion that female Wikipedians are less likely to participate in such user surveys. They managed to quantify this participation bias in the case of the 2008 UNU-MERIT Wikipedia user survey, correcting the above-mentioned 13% to 16%, and arriving at an estimate of 22.7% female editors in the US (more than a quarter higher than among US respondents in that survey). Hence we now know that the percentages given below are likely to be several percentage points lower than the real female ratio. (A toy illustration of this kind of correction follows this list.)
  • Different definitions of “editor”: Most of these surveys have focused on logged-in users, but there are also many people contributing as anonymous (IP) editors without logging into an account. What’s more, many users create accounts without ever editing (for this reason, the 2011/12 editor surveys contained a question on whether the respondent had ever edited Wikipedia, and excluded those who said “no”. Without this restriction, female percentages are somewhat higher).
  • Active users only: Because they only reach users who visit the site during the time of the survey, these surveys target active users only. And depending on methodology, users with a higher edit frequency (who, as some evidence suggests, are more likely to be male) may be more likely to participate as respondents.
  • Sample size: As usual with surveys, the fact that respondents form only part of the surveyed population gives rise to a degree of statistical uncertainty about the measured percentage, which can be quantified in the form of a confidence interval (illustrated in the first sketch below).
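
To make the last caveat concrete, here is a minimal sketch in Python of a 95% confidence interval for a sample proportion, using the standard normal approximation (the published survey reports may have used a different method, and a self-selected web survey is not a simple random sample, so this understates the true uncertainty):

    import math

    def proportion_ci(p_hat, n, z=1.96):
        """95% confidence interval for a sample proportion
        (normal approximation; assumes a simple random sample,
        which a self-selected web survey is not)."""
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        return p_hat - z * se, p_hat + z * se

    # December 2011 editor survey: 9% female among n=6,503 respondents
    low, high = proportion_ci(0.09, 6503)
    print("95% CI: {:.1%} to {:.1%}".format(low, high))  # -> roughly 8.3% to 9.7%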

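Hill and Shaw's correction itself used propensity score estimation against an independent population survey; see their paper for the actual method. As a purely illustrative toy model of the general idea, the sketch below shows how a lower response propensity among women deflates the raw percentage. The relative propensity of 0.78 is an assumed number, chosen so that the toy reproduces their headline correction of 13% to 16%; it is not a figure from their paper:

    def corrected_share(observed, rel_propensity):
        """Toy inverse-propensity correction: if women respond to the
        survey at rel_propensity times the male rate, up-weighting
        their responses recovers an estimate of the true share."""
        reweighted_f = observed / rel_propensity
        return reweighted_f / (reweighted_f + (1 - observed))

    # Raw 2008 figure: 13% female; assumed relative response propensity of 0.78
    print("{:.0%}".format(corrected_share(0.13, 0.78)))  # -> 16%
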
Still, these caveats do not change the fact that the results from these web-based surveys remain the best data we have on the problem. And the overall conclusion remains intact that Wikipedia’s editing community has a large gender gap.

What follows is a list of past surveys, briefly summarizing the targeted population and stating the percentage of respondents in each who answered the question about their gender with “female”. In each case, please refer to the linked documentation for further context and caveats. Keep in mind that the stated percentages have not been corrected for the aforementioned participation bias, i.e. it is likely that many of them are several percentage points too low, per Hill and Shaw's result.

General user surveys

(As detailed above, please be aware that the percentages from different surveys are not necessarily comparable, and are likely to be several percentage points lower than the real female ratio.)

2012 Editor Survey

  • Population: Logged-in Wikipedia users who did not respond “no” to the question “Have you EVER edited Wikipedia?”
  • Method: Banners in 17 languages, shown only once per user (October/November 2012)
  • 10% female (n=8,716. 11% when including non-editors and users who took the survey on Wikimedia Commons. 14% among Commons users, with n=463)

December 2011 Editor Survey

  • Population: Logged-in Wikipedia users who did not respond “no” to the question “Have you EVER edited Wikipedia?”
  • Method: Banners in multiple languages, shown only once per user
  • 9% female (n=6,503)

April 2011 Editor Survey

  • Population: Logged-in Wikipedia users who did not say they had made 0 edits so far
  • Method: Banners in 22 languages, shown only once per user
  • 9% female (n=4,930)

UNU-MERIT/WMF survey (2008)

  • Population: Site visitors who described themselves as “Occasional Contributor” or “Regular Contributor”
  • Method: Banners shown to both logged-in and logged-out users, in multiple languages
  • 13% female (n=53,888)

Other surveys

There have also been several surveys with a more limited focus, for example:

Global South User Survey (WMF, 2014)

  • Population: Site visitors in 11 countries and 16 languages who selected “Wikipedia” (among other large websites) in response to the question “Which accounts do you most frequently use?”
  • Method: Banners shown to both logged-in and logged-out users
  • 20% female (n=10,061)

Note: In this survey, the ratio of female editors was much higher than in the 2011 and 2012 surveys, in those countries where data is available. However, it is plausible that this difference can largely be attributed to the different methodologies rather than to an actual rise in female participation across the Global South.

Gender micro-survey (WMF, 2013)

  • Population: Newly registered users on English Wikipedia
  • Method: Overlay prompt immediately after registration
  • Draft results: 22% female (n=32,199. 25% when not counting “Prefer not to say” responses)

JASIS paper on anonymity (2012)

  • Population: Active editors on English Wikipedia (estimated at 146,208 users at the time of the survey in 2012)
  • Method: User talk page messages sent to a random sample of 250 users
  • 9% female (n=106)
Tsikerdekis, M. (2013). “The effects of perceived anonymity and anonymity states on conformity and groupthink in online communities: A Wikipedia study”. Journal of the American Society for Information Science and Technology. DOI:10.1002/asi.22795 (preprint, corresponding to the published version)

“Grassroots Survey” (Wikimedia Nederland, 2012)

  • Population: Members of the Dutch Wikimedia chapter and logged-in users on the Dutch Wikipedia
  • Method: Banner on Dutch Wikipedia, and letters mailed to chapter members
  • 6% female (n=1,089 completed responses)

Wikibooks survey (2009/2010)

  • Population: Wikibookians in English and Arabic
  • Method: Project mailing list postings and sitenotice banners
  • 26% female (of 262 respondents, 88% of whom described themselves as contributors)
Hanna, A. 2014, ‘How to motivate formal students and informal learners to participate in Open Content Educational Resources (OCER)’, International Journal of Research in Open Educational Resources, vol. 1, no. 1, pp. 1-15, PDF

Wikipedia Editor Satisfaction Survey (Wikimedia Deutschland with support from WMF, 2009)

  • Population: Logged-in and anonymous editors on German and English Wikipedia
  • Method: Centralnotice banner displayed after the user’s first edit on that day, for 15 minutes (all users on dewiki, 1:10 sampled on enwiki)
  • 9% female (ca. 2100 respondents – ca. 1600 on dewiki, ca. 500 on enwiki)
Merz, Manuel (2011): Understanding Editor Satisfaction and Commitment. First impressions of the Wikipedia Editor Satisfaction Survey. Wikimania 2011, Haifa, Israel, 4-7 August 2011 PDF (archived)

“What motivates Wikipedians?” (ca. 2006)

  • Population: English Wikipedia editors
  • Method: Emailed 370 users listed on the (hand-curated, voluntary, since deleted) “Alphabetical List of Wikipedians”, inviting them to fill out a web survey
  • 7.3% female (n=151)
Nov, Oded (2007). “What Motivates Wikipedians?”. Communications of the ACM 50 (11): 60–64. DOI:10.1145/1297797.1297798, also available here

“Wikipedians, and Why They Do It” (University of Würzburg, 2005)

  • Population: Contributors to the German Wikipedia
  • Method: Survey invitation sent to the German Wikipedia mailing list (Wikide-l) (“The sample characteristics of the present study might be [a] limitation because participants were very involved in Wikipedia … the reported results might not be the same for occasional contributors to Wikipedia.”)
  • 10% female (n=106)

Trend analysis: How the gender gap changed during 2012

As mentioned above, one can't meaningfully compare the overall percentages of the December 2011 and 2012 surveys, as they covered different populations. However, if we look only at editors from a particular country, we have two comparable populations. Here is the trend data per country from the two most recent general editor surveys:

Country     | 2012 survey: % female (n) | Dec 2011 survey: % female (n) | Change (Dec ’11 to Oct/Nov ’12) | Significant? (2-tailed z-test, p = 0.05)
US          | 17.0% (1368)              | 13.6% (847)                   | +3.4%                           | significant
Germany     | 8.6% (1017)               | 8.3% (866)                    | +0.2%                           | not significant
France      | 9.3% (707)                | 11.5% (407)                   | -2.2%                           | not significant
Russia      | 11.1% (559)               | 7.4% (651)                    | +3.7%                           | significant
India       | 3.1% (255)                | 3.3% (121)                    | -0.2%                           | not significant
UK          | 9.2% (425)                | 8.6% (278)                    | +0.5%                           | not significant
Italy       | 11.6% (398)               | 20.2% (431)                   | -8.6%                           | significant
Japan       | 6.8% (351)                | 6.1% (231)                    | +0.8%                           | not significant
China       | 4.2% (167)                | –                             | –                               | –
Canada      | 12.0% (242)               | 7.2% (139)                    | +4.8%                           | not significant
Poland      | 7.8% (206)                | 5.3% (263)                    | +2.4%                           | not significant
Ukraine     | 9.5% (201)                | –                             | –                               | –
Australia   | 13.0% (177)               | –                             | –                               | –
Spain       | 8.6% (186)                | 4.0% (177)                    | +4.6%                           | significant
Netherlands | 7.4% (136)                | –                             | –                               | –
Brazil      | 3.8% (105)                | 7.1% (140)                    | -3.3%                           | not significant
Israel      | 15.0% (127)               | 8.9% (123)                    | +6.0%                           | not significant
Sweden      | 13.5% (111)               | –                             | –                               | –
Argentina   | 13.7% (102)               | –                             | –                               | –

(Only showing countries where more than 100 respondents stated their gender. See here and here for the survey instruments. A fuller report on the 2012 survey with more detail on the methodology will be released soon.)
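
The “Significant?” column corresponds to a standard two-proportion z-test. Here is a minimal sketch of the pooled-variance version in Python, applied to the US row; the original analysis scripts are not published in this post, so treat this as a reconstruction of the test named in the table header rather than the exact code used:

    import math

    def two_proportion_z(p1, n1, p2, n2):
        """Two-tailed z-test statistic for the difference of two
        proportions, using the pooled standard error."""
        pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        return (p1 - p2) / se

    # US row: 17.0% of 1368 respondents (2012) vs. 13.6% of 847 (Dec 2011)
    z = two_proportion_z(0.170, 1368, 0.136, 847)
    print("z = {:.2f}; significant at p = 0.05: {}".format(z, abs(z) > 1.96))
    # -> z = 2.14; significant at p = 0.05: True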

Overall, there is no evidence that the general problem became more or less severe during that year, but the fact that several countries saw statistically significant changes indicates that the gender gap is not immutable. (It should be mentioned that during 2012 – i.e. the time span between these two surveys – the Wikimedia Foundation supported the work of a US-based community fellow to encourage the participation of women in Wikimedia projects. There isn't enough data to assert a causal connection with the 3.4 percentage point rise in the US during this time, but it's an encouraging data point nevertheless. The success of our current “Inspire” campaign will be measured by incremental changes in female participation on a per-project basis, among other metrics, rather than by trying to attribute changes in overall percentages to specific activities.)

Other data sources about the size of the gender gap

Besides surveys in which editors are asked directly about their gender, some community members and researchers have examined signals through which users voluntarily publish their gender on the site itself, such as the optional gender setting in the user preferences.

While this can produce some interesting results, it is important to be aware of the limitations of these approaches when used to estimate the overall ratio of female users (see e.g. section 3.2 “Assumptions and Limitations” in the 2011 “WP:Clubhouse” paper by Lam et al., which uses a combination of such signals). Unlike on many other sites (e.g. Facebook), the gender information in the user preferences is optional; the setting is somewhat hidden, and the majority of accounts do not use it. There are good reasons to assume that the differing incentives distort that data even more than the anonymous responses to banner-advertised surveys. For example, the user has to be comfortable with stating their gender in public, and in several languages female users have to set that preference if they want system messages to address them with the correct grammatical gender – e.g. so that the word “user” next to their nickname shows up in its female rather than male form (such as “Benutzerin” vs. “Benutzer” in German). Male users do not have that incentive.
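
The preference is exposed through the public MediaWiki API (usprop=gender on list=users), which is one way such analyses can sample it at scale. Here is a minimal sketch; the account name is just a placeholder, and the third-party requests library is assumed:

    import requests

    API = "https://en.wikipedia.org/w/api.php"

    def user_gender(username):
        """Return the public gender preference of an account:
        'male', 'female' or 'unknown' (the default for the large
        majority of accounts, which never set the preference)."""
        params = {
            "action": "query",
            "list": "users",
            "ususers": username,
            "usprop": "gender",
            "format": "json",
        }
        data = requests.get(API, params=params).json()
        return data["query"]["users"][0]["gender"]

    print(user_gender("Example"))  # placeholder account name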

Other research about the gender gap

This post does not cover some arguably more important questions about the gender gap, e.g.:

  • What factors contribute to the gender gap, and what can we do to mitigate them?
  • What effect does the gender gap among contributors have on Wikipedia’s content?

For further research on these and other questions, see e.g. the “Address the gender gap” FAQ on Meta-wiki, or follow our monthly newsletter about recent academic research on Wikipedia.

Tilman Bayer, Senior Analyst, Wikimedia Foundation

Many thanks to Aaron Shaw, Alex “Skud” Bayley and Siko Bouterse for reviewing drafts of this post (all errors remain the author’s own).

2015-05-08: Edited to add the sample size for the 2008 UNU-MERIT/WMF survey



Comments

I don’t understand: why do you say that comparing statistically non-representative results on a country basis suddenly makes them statistically representative? Participation bias can go in any direction and vary, the survey changed from 2011 to 2012, we can’t know where it was advertised (if proportionally to women and men or not) and how that varied from 2011 to 2012, etc. etc.
The fact that percentages in two countries (Spain and Italy) halved or doubled from one year to the next doesn’t prove anything.

Thanks for the report, an interesting read for Mayday. – Apart from that I just would like to point out that the Wikimedia blog won’t print in either Safari or Firefox under OS X Mavericks. It breaks reproducibly after the first page. You might like to fix the glitch. – Thanks again.

Your project has no chance when the people at the top (Jimmy Wales and Sue Gardner) spout sexist comments on National TV: http://newslines.org/blog/the-sexists-at-the-top-of-wikipedia/

Tilman, thanks. Well, the blog post says it’s a meaningful comparison, while to me it seems quite risky. I understand your points, but how can you say the advertising was the same? If you mean the CentralNotice campaigns were identical (which is not true because the translation changed), did you account for: a) changes in the CentralNotice extension and infrastructure itself, b) conflicting or overlapping local notices (I see there were some in that period on it.wiki), c) variations in the target page (including differences in the l10n of LimeSurvey vs. Qualtrics for Italian, Spanish, Russian), d) advertising for the…

Heya – {ping}ed you and 2 other “Labcoats”, requesting new data (live data?) to monitor progress on ending (finally) Gender and Systemic biases among Wikipedians and in Wikipedia Articles https://meta.wikimedia.org/wiki/Research_talk:Gender_gap – Thanks & Mabuhay! – Leo
