Data analytics

  1. “Dare to be different, yet hold your head high”: the impact of Prince’s death on Wikipedia

    Photo by Zarateman, CC BY-SA 4.0.

    “Prince is synonymous with indulging in yourself shamelessly and with passion. His music and his cultural impact mean more than words can offer.”
    “He taught me that being black was more than a color. That despite our circumstances or limitations in life, you could change the world and you could reach out and touch the stars.”
    “Dare to be different, yet hold your head ... Read more

  2. Introducing the unique devices dataset: a new way to estimate reach on Wikimedia projects

    Photo by Tiago Aguiar, public domain.

    With the unique devices dataset, we’ve been able to quantify the shift to mobile across all projects. In almost all Wikimedia projects, more than half of our unique devices are accessing content using the mobile sites.... Read more

  3. 15 years of Wikipedia in data visualization

    Image by Stephen Laporte, CC BY-SA 3.0.

    Leave it to designers to show us how dynamic, global, and human Wikipedia really is. Here, we look at fifteen of the best data visualizations making use of Wikipedia data.... Read more

  4. Wikipedia’s very active editor numbers have stabilized—delve into the data with us

    Graph by Joe Sutherland, in the public domain.

    Very active editor numbers (>100 edits per month) since the English Wikipedia’s launch in 2001. The thick red line symbolises a five-month moving average.

    The English Wikipedia’s population of very active editors, registered contributors with more than 100 edits per month, appears to have stabilized after a period of decline. We’... Read more

  5. Growing free knowledge through open data

    This Sankey diagram shows how readers reach the English Wikipedia article about London and where they go from there, based on the Wikipedia Clickstream data set. Graph by Ellery Wulczyn and Dario Taraborelli, CC0.

    Open data can help us understand how people find and share knowledge online. The Wikimedia Foundation’s Research and Data Team has published 5 open data sets about Wikimedia projects. (…)... Read more

  6. What are readers looking for? Wikipedia search data now available

    (Update 9/20 17:40 PDT) It appeared that a small percentage of queries contained information unintentionally inserted by users. For example, some users may have pasted unintended information from their clipboards into the search box, causing the information to be displayed in the datasets. This prompted us to withdraw the files. We are looking into the feasibility of publishing search logs at an... Read more

  7. Improving the accuracy of the active editors metric

    We are making a change to our active editor metric to increase accuracy, by eliminating double-counting and including Wikimedia Commons in the total number of active editors. The active editors metric is a core metric for both the Wikimedia Foundation and the Wikimedia communities and is used to measure the overall health of the different communities. The total number of active editors is defined ... Read more

  8. US Education Program participants add three times as much quality content as regular new users

    Wikipedia Education Program participants from the United States added more than three times as much quality content as regular new users, a quantitative analysis shows. In the Wikipedia Education Program, professors assign their students to edit Wikipedia articles for a class grade, assisted by volunteer Wikipedia Ambassadors. In fall 2011, 55 courses participated in the program in the United S... Read more

  9. Techies learn, make, win at Foundation’s first San Francisco hackathon

    In January, 92 participants gathered in San Francisco to learn about Wikimedia technology and to build things in our first Bay Area hackathon.... Read more

  10. Do It Yourself Analytics with Wikipedia

    As you probably know, we regularly publish backups of the different Wikimedia projects, containing their complete editing history. Over time, these backups grow larger and become increasingly hard to analyze. To help the community, researchers, and other interested people, we have developed a number of analytic tools for analyzing these large datasets. To... Read more