Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts by Zack Exley

Intro to the statistics of A/B testing with Wikimedia fundraising banners

The Wikimedia fundraising team relies on A/B testing to increase the efficiency of our fundraising banners. We raise millions of dollars to cover the expense of serving Wikipedia and the other Wikimedia sites. We don’t want to run fundraising banners all year round; we want to run them for as few days as possible. Testing has allowed us to dramatically cut the number of days of banners each year, from 50 to about nine.

We’re in the middle of reevaluating the statistics methods we use to interpret A/B tests. We want to make sure we’re answering this question correctly: When A beats B by x percent in, say, a one-hour test, how do we know that A will keep beating B by x percent if we run it longer? Or, less precisely, is A really the winner? And by how much?

If you’re not familiar with this kind of statistics, thinking about coin tossing can help: If you flip a coin one thousand times, you’re going to get heads about half the time. But what if you flip a coin only 4 times? You will often get heads 2 times, but more often than not you’ll get heads 0, 1, 3 or 4 times. Four coin flips are not enough to know how often you’ll really get heads in the long run.
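
To make the coin example concrete, here is a minimal Python sketch (not the R code our team uses) that computes the exact chance of each outcome with the binomial formula. The function name `heads_probability` and the "within 5 percentage points" window are our own illustrative choices.

```python
from math import comb

def heads_probability(flips: int, heads: int, p: float = 0.5) -> float:
    """Chance of exactly `heads` heads in `flips` tosses of a coin with P(heads) = p."""
    return comb(flips, heads) * p**heads * (1 - p) ** (flips - heads)

# Four flips: the "expected" result of 2 heads shows up only 37.5% of the time.
four_flip_dist = [heads_probability(4, k) for k in range(5)]
print(four_flip_dist)  # [0.0625, 0.25, 0.375, 0.25, 0.0625]

# One thousand flips: add up the chance of landing between 450 and 550 heads,
# that is, within 5 percentage points of one half.
near_half = sum(heads_probability(1000, k) for k in range(450, 551))
print(f"P(450 to 550 heads in 1000 flips) = {near_half:.4f}")
```

With four flips the outcomes are spread widely, while with one thousand flips almost all of the probability sits close to one half.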

In our case, each banner view is like a coin toss: heads is a donation, tails is no donation. But it’s an incredibly lopsided coin. In some countries, and at certain times of day, we might only get “heads” one in one hundred thousand “flips.” Think about two banners with a small difference: one has all bold type and one only has key phrases in bold. Those are like two coins with very slightly different degrees of lopsidedness. Imagine that, over the course of a particular test, one results in donations at a rate of 50 per hundred thousand, and another at a rate of 56 per hundred thousand.

Our question is: How can we be sure that the difference in response rates isn’t due to chance? If our sample is large enough (as when we flipped the coin one thousand times) then we can trust our answer. But how large is large enough?
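
One common textbook way to approach this question is a two-proportion z-test. The Python sketch below is an illustration under that assumption, not a description of the R functions we actually run. The view counts are hypothetical: 1,000,000 views per banner for a short test, and 3,500,000 per banner for a longer one.

```python
from math import erfc, sqrt

def two_proportion_z_test(donations_a, views_a, donations_b, views_b):
    """Return (z, two-sided p-value) for the difference between two response rates."""
    rate_a = donations_a / views_a
    rate_b = donations_b / views_b
    # Pooled rate under the assumption that both banners perform the same.
    pooled = (donations_a + donations_b) / (views_a + views_b)
    se = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_a - rate_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# 56 vs 50 donations per 100,000 views at two hypothetical sample sizes.
z_small, p_small = two_proportion_z_test(560, 1_000_000, 500, 1_000_000)
z_large, p_large = two_proportion_z_test(1960, 3_500_000, 1750, 3_500_000)
print(f"1.0M views per banner: z = {z_small:.2f}, p = {p_small:.4f}")
print(f"3.5M views per banner: z = {z_large:.2f}, p = {p_large:.4f}")
```

Under these assumptions, the same 56-versus-50 difference gives z of about 1.84 at one million views per banner, which falls short of the conventional 5 percent threshold, and z of about 3.45 at 3.5 million views, which clears it comfortably. That is the sense in which "large enough" depends on both the response rate and the size of the difference.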

We run various functions using a statistical programming language called R to answer all these questions. In future posts, if readers are interested, we’ll get into more details. But today, we just wanted to show a few graphs we’ve made to check our own assumptions and understanding.

The graph immediately below is from an exercise we just completed: We went back to one of our largest test samples, where we ran A vs B for a very long time. In fact, it is the example of “all bold” vs “some bold” I just mentioned, and the response rates across a range of low-donation-rate countries were 56 and 50 per 100,000 respectively. We chopped that long test up into 25 smaller samples that are closer to the size of our typical tests — in this case with about 3.5 million banner views per banner per test. Then we checked how often those short tests accurately represented the “true”(er) result of the full test.

In the graph below, you’re looking at how much A beat B by in each test (actually, each subset of the larger test). The red vertical line represents the true(er) value of how much A beats B based on the entire large sample. Each dot represents A’s winning margin in a different test, expressed as a ratio: 1.1 means A beat B by 10 percent. This kind of graph is called a histogram. The bars show how many results fit into different ranges. You can see that most of the tests fall around a central value. This is good to see! Our stats methods assume the data conforms to a certain pattern, which is called a “normal distribution.” And this is one indication that our data is normal.

[Figure: histogram of test data]

Another piece of good news: all of the dots are greater than 1. That means that none of these smaller tests lied about banner A being the winner. What’s sad, though, is how much most of the tests lie about how much A should win by. This isn’t a surprise to us — we know that those ranges are wide — especially when response rates are as low as they were in this test.
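
To give a feel for how wide those ranges are, here is a rough Python sketch of an approximate 95 percent interval for the ratio A/B in a single short test. It uses the standard log-ratio approximation for two independent proportions, which is our assumption for illustration and not necessarily the method in our R code. The donation counts are simply 56 and 50 per 100,000 scaled to about 3.5 million views per banner.

```python
from math import exp, log, sqrt

def ratio_with_interval(donations_a, donations_b, views_per_banner, z=1.96):
    """Return (ratio, low, high): rate A / rate B with an approximate 95% interval."""
    rate_a = donations_a / views_per_banner
    rate_b = donations_b / views_per_banner
    ratio = rate_a / rate_b
    # Standard error of log(ratio) for two independent proportions.
    se_log = sqrt((1 - rate_a) / donations_a + (1 - rate_b) / donations_b)
    low = exp(log(ratio) - z * se_log)
    high = exp(log(ratio) + z * se_log)
    return ratio, low, high

# 56 and 50 donations per 100,000 at about 3.5 million views per banner.
ratio, low, high = ratio_with_interval(1960, 1750, 3_500_000)
print(f"A/B = {ratio:.2f}, plausible range roughly {low:.2f} to {high:.2f}")
```

Under these assumptions the interval runs from roughly 1.05 to 1.19. It stays above 1, so a test of this size should usually pick the right winner, but the winning margin it reports could plausibly be anywhere from about 5 percent to about 19 percent.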

One fun thing R can do is generate random data that conforms to certain patterns. The graphs below show what happened when we asked R to make up normally distributed data using the same banner response rates. Compare the fake data graphs below to the real data graph above. First of all, notice how much the three graphs vary below. That’s one simple way of showing that our real data doesn’t need to look exactly like any one particular set of R-generated normal data to be normal.
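
The same kind of experiment can be sketched in Python. This is an illustration, not our R script: it assumes A is the 56-per-100,000 banner, draws each banner's donation count from a normal approximation to the binomial distribution, and repeats that for 25 made-up tests of about 3.5 million views per banner. The seeds and helper names are arbitrary.

```python
import random
from math import sqrt

VIEWS_PER_BANNER = 3_500_000  # approximate size of each short test
RATE_A = 56 / 100_000         # assumed response rate of banner A
RATE_B = 50 / 100_000         # assumed response rate of banner B

def simulated_donations(rate, views, rng):
    """Draw a donation count from the normal approximation to a binomial."""
    mean = views * rate
    sd = sqrt(views * rate * (1 - rate))
    # Floor at 1 so the ratio below can never divide by zero or go negative.
    return max(rng.gauss(mean, sd), 1.0)

def simulate_margins(n_tests, seed):
    """Return A's winning margin (donations A / donations B) for n_tests fake tests."""
    rng = random.Random(seed)
    return [
        simulated_donations(RATE_A, VIEWS_PER_BANNER, rng)
        / simulated_donations(RATE_B, VIEWS_PER_BANNER, rng)
        for _ in range(n_tests)
    ]

# Three independent batches of fake tests, like three separate graphs.
for seed in (1, 2, 3):
    margins = simulate_margins(25, seed)
    wins = sum(m > 1 for m in margins)
    print(f"seed {seed}: min={min(margins):.3f} max={max(margins):.3f} A wins {wins}/25")
```

Each seed produces a different batch of 25 margins, so the minimum, maximum and shape differ from run to run even though every batch comes from the same underlying rates.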

[Figure: three graphs of randomly generated test data]

Even so, can we trust that our data is normally distributed? We think so, but we have some questions. Our response rates vary dramatically over the course of a 24-hour day (high in the day, low at night). Does that create problems for applying these statistical techniques? In this particular test, the response rate varies wildly from country to country — and there are dozens of countries thrown into this one test. Does that also cause problems? Tentatively, we don’t think so because the thing we’re measuring in the end — the percentage by which A beats B — doesn’t vary wildly by country or time of day…we think. But even if it did, since A is always up against B in the exact same set of countries and times, we think it shouldn’t matter. One little (or maybe big?) sign of hope is that the range of our real data approximately matches the ranges of the randomly generated normal data.

But those are a few of the assumptions we’re working to check. We’re always reaching out to people who can help us with our stats. We’re looking for PhD-level math or stats people who have direct experience with A/B testing or a similar response phenomenon. Email fundraising@wikimedia.org with “Stats” in the subject line if you think you might be able to help, or know someone who could.

Zack Exley, Chief Revenue Officer, Wikimedia Foundation and Sahar Massachi

All Our Ideas in the Wikimedia fundraiser

The Wikimedia fundraiser relies on two things: banners and appeals. The banners appear at the top of the site, featuring a picture of someone from the Wikimedia movement (our founder Jimmy, an editor, a reader, or a donor), and the words, “Please read: A personal appeal from Wikimedia (Founder|Editor|Reader) So and So.”

Clicking the banner lands you on a donation form featuring a letter from the person in the banner. A lot of fundraising experts have told us this is a dumb way to fundraise. They say people don’t read the appeals, and that surely there’s something better we could run in the banners other than “Please read a personal appeal.”

We’ve tested the appeal pages against simple donation forms with no appeals, with basic facts, and slogans, and nothing has performed better than the appeals. We’re happy about that, because we love that the fundraiser serves a double purpose of educating our 470 million readers about how Wikipedia and the Wikimedia movement work.

But we’re unhappy that we haven’t been able to find anything better than “Please read a personal appeal” for our banners. It’s not for lack of trying. We’ve tested more than 100 different banner phrases. And we’ve tested a few non-human images (e.g. hands holding the Wikipedia globe logo).

Only one banner has occasionally beaten “Please read a personal appeal,” and that is: “If everyone reading this donated $5, we could end the fundraiser today.” But that banner seems to set the expectation that the fundraiser is about to end soon, so we only like to use that at the end of the campaign.

Last year, we asked the Wikimedia community to suggest banners and tested many of them. None came close to beating “personal appeal.” This year, though, thanks to a tool created by friends at Princeton University, we have a new way to revisit those ideas, and bring in some new ones, for testing.

Professor Salganik and his research group are the developers of All Our Ideas, an open source platform for public participation. It enables groups to collect and prioritize information in a way that is democratic, transparent, and efficient, and it has already been used by governments and non-profit organizations around the world.

He approached us about using this tool for choosing new banners to test and we said we would like to try it. You can go there now and start voting on banners at:

http://www.allourideas.org/wikipedia-banner-challenge

We’ll be watching the results and will test the ones that come out on top in the voting. We’ve helped to seed the tool with banners proposed by the community last year. We were not able to test all of the ideas suggested then. We will test at least a handful of the ones that come out on top in this voting process that haven’t been tested before — as long as they are in line with the spirit and values of the Wikimedia movement.

There is also a way to propose new ideas — and new images — for banners using the All Our Ideas tool.

Finally, one thing I should explain is why we’re looking for a better banner. Each year, we only raise what we need and then end the fundraiser. If a better banner brings double the number of donors from our best current banner, then we can cut the duration of the fundraiser in half — and that would be a very good thing.

Fundraise differently

What I love about the annual Wikimedia Foundation Fundraiser is that it proves all the cynics wrong. By traffic, we’re the #5 web property in the world, serving 422 million people 12 billion times last month. But we are funded entirely by voluntary donations. No government grants, no corporate sponsors, no ads. Each year, we ask our readers to pitch in “$5, $10 or $20 to keep Wikipedia free” and, so far, they’ve always met the need. When we reach the goal, we stop asking.

I wish more top websites did this. So many new kinds of sites have woven themselves into our lives and communities. Isn’t it sad that they must work to expose us to as much advertising as possible and sell our personal information to stay in business? Why is it that only one top website, Wikipedia, is supported directly and voluntarily by its users? I would love to see our funding model become a realistic option for future startups hoping to embed themselves in daily life.

This blog is where the Wikimedia Fundraising team and our friends and collaborators will discuss how we actually run the fundraiser. Every day brings new, often surprising, lessons that will be of interest to fundraising professionals and marketers. Another thing I love about our fundraiser is how frequently it overturns various marketing and fundraising dogmas and uncovers exceptions to iron laws of human nature preached by pop social science. Maybe Wikimedia donors are just different. Or maybe the approaches we’re free to experiment with in our unique context are revealing that there’s a better way to engage one’s audience.

Our fundraiser officially begins in November, but we’ve been testing new approaches each week for the past few months — form designs, banners, and new appeals from Wikimedia volunteers and staff. We’ve also been having great adventures in revamping our open source technical infrastructure and preparing to accept hopefully hundreds of new global payment methods and currencies.

We’ve got a lot to report on already. Please stay tuned and help us succeed this year with comments, suggestions and your own experiences in fundraising.

Zack Exley,
Chief Community Officer, Wikimedia Foundation

Announcing the Community Department Summer of Research

It is my pleasure to announce that, starting today and continuing through August this year, the Community Department at the Wikimedia Foundation will be hosting eight talented academics from around the world for a summer research program.

This initiative brings together those working in a variety of fields — from Computer Science to Rhetoric and Language Studies — for an interdisciplinary look at Wikipedia communities. We’ve also made an effort to bring aboard those who have previously chosen Wikimedia projects as a research topic in their academic work. Led by Diederik van Liere, Maryana Pinchuk, and Steven Walling, the group will be concentrating on a wide-ranging set of questions that address vital issues of openness and participation in Wikipedia.

Please welcome all our summer researchers:

  • R. Stuart Geiger is a PhD candidate at the UC Berkeley School of Information, focusing on knowledge production in distributed and decentralized environments, specifically Wikipedia and scientific research networks. He has been a Wikipedia editor since 2004 and has been studying the project as an ethnographer since 2007. His current research explores the relationship between technical infrastructures and social structures, and he has written on bots, vandal fighting, administration, and the history of Wikipedia.
  • Aaron Halfaker is a PhD candidate in Computer Science at the University of Minnesota, GroupLens Research, focusing on computer-mediated human interaction. Aaron started editing Wikipedia four years ago and quickly found his niche creating user scripts to find ways of improving the collaborative experience. His research explores mechanisms for motivating and supporting volunteer collaboration.
  • Fabian Kaelin is a Master of Science candidate from McGill University, focused on machine learning.
  • Melanie Kill is Assistant Professor of English at the University of Maryland, specializing in digital rhetoric and genre studies. She is currently at work on a book on Wikipedia and the history of the genre of the encyclopedia. She earned her PhD in Rhetoric and Language Studies from University of Washington and previously has taught at Texas Christian University.
  • Giovanni Luca Ciampaglia is a PhD candidate at the University of Lugano in Switzerland. Giovanni is a computer scientist who studies user involvement in commons-based peer production communities, group consensus and collective deliberation processes.
  • Yusuke Matsubara is a PhD student at the University of Tokyo (Japan), studying computational linguistics. His research focuses on analysing how people write and read from a computational and empirical point of view. Since 2008, he has been an occasional writer, translator and programmer for Wikimedia.
  • Jonathan Morgan is a PhD candidate, University of Washington, studying social interaction on collaborative online creative environments. As a researcher, he is particularly interested in tracing connections between the things people say (and the way they say them) and their roles, goals and activities online. He also works on the design of tools for improving public deliberation on the web, and on practical tools for internet researchers.
  • Shawn Walker is a PhD candidate at the University of Washington iSchool, and studies digital government and public engagement.

This endeavor presents a unique opportunity: there will be more full-time researchers at the Wikimedia Foundation for these three months than ever before in its history. By gathering a working group with strong qualitative and quantitative skillsets, we hope to produce a rich body of results that is both scientifically rigorous and of great practical use to the Wikimedia movement.

If you’re a Wikipedian who would like to get involved, we’d love to have you participate. We especially seek community members who can pragmatically align our research with the everyday realities of working on the encyclopedia, as well as think about how trends uncovered may be constructively addressed.

For other researchers and developers interested in Wikipedia, we’ll be publishing our code and data (when in line with our privacy policy) under a free and open license so that you can build on it in the future.

If you’re curious about the particulars of the research, our documentation on Meta and the associated discussion pages are the best sources of information. We’ll also be blogging frequently here about our work.


Zack Exley,
Chief Community Officer

New Wikimedia Fellow

I’m pleased to announce Achal Prabhala as our latest Wikimedia Foundation fellow.

Achal is a writer and researcher in Bangalore who has participated as a volunteer in the Wikimedia movement in India and globally for years, and as a member of the Foundation’s advisory board.

Achal will be conducting field research in rural South Africa and India with Wikipedians and non-Wikipedians across three languages to explore ways to compensate for the gap in published/printed sources in many local languages.

From Achal:

Even if every single person in the south with Internet access wanted to become an active editor on Wikipedia, there is still a problem that we are going to run up against. It’s a problem that bedevils everyone working in local languages in Asia and Africa, and it’s something we have no control over: the lack of published scholarly resources in these languages.

For Wikipedias in languages of the south, citations are not difficult to find when the articles being added are translations. However, since we all want the sphere of knowledge to be universally expanded – and not merely transferred from the north to the south – we are forced to confront two specific problems with finding citations for important local subject matter: (i) Published resources may simply not exist. (ii) Even when published scholarly resources exist, they may be limited or inaccessible and thus effectively rendered invisible to Wikipedians.

To put it another way, it’s possible that the sum of published scholarly work from Europe is somewhat close to the sum of ‘European’ knowledge, and that the sum of accessible, published scholarly work in many Asian and African languages is nowhere close to the corresponding body of knowledge that circulates among speakers of those languages.

Despite these problems, Tamil Wikipedia has about 25,000 articles, Malayalam Wikipedia has about 15,000, and Northern Sotho Wikipedia has about 600. In all these language Wikipedias, there are articles – especially when concerning subjects that are specific to a particular people or place where the language is spoken – which lack citations, because there are simply no or not enough published resources to refer to.

The scope of the project is to investigate how one might compensate for the lack of traditional citations; how an alternative means of citation may be constructed; and how this may be feasibly and easily deployed – and improved – by Wikipedians in the future.

I’m looking forward to seeing fascinating and useful results from Achal’s project.

– Zack Exley, Chief Community Officer

Announcing our fourth fellow, Lennart Guldbrandsson

Lennart Guldbrandsson with Wikimedia Foundation's Head of Outreach Frank Schulenburg

Today it is my pleasure to announce that the Wikimedia Foundation has a new fellow, Lennart Guldbrandsson. He is the fourth fellow, after Steven Walling, Victoria Doronina and Maryana Pinchuk. During his fellowship, Lennart will work on two projects: the Bookshelf Project and the Account Creation Improvement Project.

A long-time Wikipedian from Sweden, Lennart has been on our radar for some time now. He founded and became the first chair of the second-largest Wikimedia chapter, Wikimedia Sverige, wrote one of the first books about Wikipedia, and has written several of the Wikipedia instructional videos. Lately, we hired him to work with the Bookshelf Project, as he has many years of experience as a writer and writing teacher. After studies at Uppsala and Gothenburg, where he now lives, he has his own writing company. He is active on several wikis under the user name Hannibal.

Lennart’s work with the Bookshelf Project will focus on getting the educational materials translated into many more languages as well as helping chapters and individuals disseminate the materials effectively on a global scale. See some examples of the Bookshelf materials here.

The other part of the fellowship is to make it easier for newcomers to create their accounts on Wikipedia – and facilitate their first steps after they have created the account. A short explanation for why that’s important is here.

The idea is that both these projects should be as engaging and transparent as possible. Lennart will post reports regularly, and try to get your feedback from time to time, and generally be reminding you of the existence of these projects.

For both of these projects, Lennart wants you to know that he will need help. You are welcome to sign up here for the Bookshelf Project or here for Account Creation. Or you can send an email to lennart@wikimedia.org.

- Zack Exley, Community Department

Two New Community Department Fellows

I’m pleased to announce two new Community Fellows: Victoria Doronina and Maryana Pinchuk, who are beginning an eight-week project to develop methods for writing histories of Wikimedia projects. The objective of this short project is to experiment in several directions toward developing a more in-depth plan for writing the histories of particular Wikipedias.

We found both Victoria and Maryana through the Community Department “open call.” Maryana is a PhD student in the Department of Slavic Languages and Literature at Harvard University but is currently based in Berkeley, CA, and therefore will be working partly in the San Francisco Wikimedia Office.  In addition to literary history, she is interested in cultural studies and community formation, which were the subjects of her undergraduate honors thesis on the semiotics of the Ukrainian Orange Revolution.  (Dr.) Victoria Doronina is a molecular biologist by training, located in the UK.  She is also an administrator and active editor of the Russian Wikipedia (User:Mstislavl).  Victoria is interested in communicating the practices and lessons of the Russian Wikipedia to other Wikipedia projects. Between them they read eight languages, which will enable them to compare many different Wikimedia projects.

Some attempts have been made to study Wikimedia history, but these studies have tended to focus on the English Wikipedia as their primary model, neglecting the individual historical evolution of other projects and the contextualization of all Wikimedia communities within a real-life geopolitical space.  In order to better understand the issues unique to each project community and to highlight solutions to common problems faced by many, it is necessary to begin experimenting with methods for researching and writing systematic comparative project histories — and make them available to the Wikimedia community at large.

Writing WikiHistory will require the development of new research methods that can grapple with the novel characteristics of wiki-based projects, which are the complex, somewhat chaotic product of anonymous contributors and prolific, highly public online figures alike.  Our Fellows will explore possible avenues for undertaking this kind of research, including the potential suitability of both off-wiki and in-wiki methods.  Some of the questions to be addressed in the primary stage of this project are:  How can the key players, events, and structural features in a Wikipedia be identified and incorporated into a historical narrative?  Is archival information enough to develop a full picture of the community’s history, or is it necessary to reach out to specific contributors?  Can wiki technology be used to create a collaborative Wikipedia history, or does synthesizing historical information and conducting original research contradict the principles of neutrality and verifiability that are fundamental to Wikipedia?  How can the results of these studies best be presented to the community, and what problems can (or can’t) they be expected to address?

For this project, we are intentionally pairing a scientist with a literary historian, and a non-Wikimedian with a longtime Wikipedia contributor and functionary. Maryana’s familiarity with combing through archival records, and Victoria’s experience with scientific research methods both feel necessary for this project to succeed — as does Victoria’s intimacy with Wikimedia projects and Maryana’s outsider’s perspective.

Please wish them luck as they undertake this experiment. If you would like to offer help, please let them know in the comments below. They could use some additional support in picking through Wikipedia data dumps.

- Zack Exley, Community Department

Wikimedia Foundation Community Fellowship program

Today I am pleased to announce the Wikimedia Foundation Community Fellows program, a project of the Foundation’s Community Department. This program is partly something the Foundation has been considering for a long time and partly an outgrowth of the recent Community Department open hiring call. In reading through the nearly 2,000 submissions to the open call, we realized that there are far more qualified members of the Wikimedia community than we can ever hire at one time. We also realized that the most promising submissions came from people who were interested in working on specific problems or opportunities in the movement rather than looking for permanent staff positions.

Therefore, the Fellows program is intended to create an environment where individuals and teams can concentrate full-time on important problems and opportunities in Wikimedia projects and the movement as a whole. Fellows will lead intensive, time-limited projects focused on key areas of risk and opportunity — projects that require the support of the Foundation to succeed.

We are currently looking for Fellows among those who have already answered the open hiring call and we will also be posting a form for submitting new Fellowship applications soon. In addition to Wikimedia volunteers, we would like to engage outside academics and professionals who have expertise that could benefit our projects.

Each Fellowship will have its own objectives that require a unique skill set. The length of the Fellowships at the Wikimedia Foundation will vary from weeks to as long as a year, depending on the requirements of the project. A handful of Fellows may eventually join the Foundation staff permanently. However, the purpose of the program is to take charge of vital work that may not require a long-term position but which volunteers have not previously dealt with successfully.

Along with introducing what will be a continuing Wikimedia Foundation program, we’d like to welcome the first Fellow in the Community Department: Steven Walling. Steven represents exactly what we’d hoped to attract through this process: talented Wikimedians who have a knack for crafting clear theories about how Wikimedia communities operate and how they can be supported. Steven’s previous experience as a freelance writer, blogger, and community manager makes him an excellent choice to pilot this program. Steven is beginning a year-long Fellowship, during which he will work on multiple projects. He will be blogging about his projects as he continues.

We are deeply excited about the possibilities for the Wikimedia Foundation Fellowship program. Bringing in talented individuals who have specific projects in mind will allow us all to ask questions and solve problems that were previously out of reach for either volunteers or staff. I hope you’ll join us in welcoming Steven and more new Wikimedia Foundation Fellows as they are announced.

Zack Exley,
Chief Community Officer