Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

“A Thousand Fibers Connect Us” – WikiViz 2011 winner visualizes Wikipedia’s global reach

WikiViz 2011: Screenshot of the winning entry

In July, the International Symposium on Wikis and Open Collaboration (WikiSym) and the Wikimedia Foundation launched WikiViz 2011, a data challenge calling for submissions to visualize the impact of Wikipedia beyond the scope of its own community, using open data. At this week’s annual WikiSym conference in Mountain View, California, the author of the winning entry, Jen Lowe (Datatelling.com), presented her work, titled “A Thousand Fibers Connect Us – Wikipedia’s Global Reach”. Drawing on open data published by the Wikimedia Foundation and the World Bank, she designed an interactive visualization that allows users to explore the readership of different Wikipedia language versions by country, and to compare countries with high or low levels of internet access. The following is an excerpt from Jen’s talk at the WikiViz award ceremony.

Visualizing Emptiness: Reflections on a Preoccupation with Missing Values

The first question to be answered for any visualization is always: what data to use? I spent a lot of time looking for outward-facing data about Wikipedia. When I finally found data about Wikipedia traffic by country, I knew I had the connections I needed between the world and the world of Wikipedia.

I cleaned the data with R and visualized it with Processing, both open source tools. The top represents countries, colored by region and more broadly by global north (blue) and south (red). The bottom represents languages. Each connection represents over 100,000 page requests in the year from April 2010 to March 2011. It’s interactive: countries and regions can be highlighted and sorted by population, pageviews, pageviews per person, and internet access. All data is transparently available on rollover.
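The connection threshold and the sort options described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the R or Processing code behind the visualization: the record layout, the function names, and the toy numbers are all invented for the example, and only the 100,000-request threshold and the four sort criteria come from the text.

```python
from dataclasses import dataclass

# From the post: a connection is drawn when a country-language pair
# has over 100,000 page requests in the year.
THRESHOLD = 100_000


@dataclass
class Country:
    # Hypothetical record layout, assumed for this sketch.
    name: str
    population: int
    internet_access: float  # share of the population online, 0..1
    requests: dict          # language code -> yearly page requests


def connections(countries, threshold=THRESHOLD):
    """Return (country, language) pairs whose requests exceed the threshold."""
    return [
        (c.name, lang)
        for c in countries
        for lang, n in c.requests.items()
        if n > threshold
    ]


def pageviews(c):
    """Total yearly page requests across all languages."""
    return sum(c.requests.values())


def pageviews_per_person(c):
    return pageviews(c) / c.population


# The four sort criteria named in the text.
SORT_KEYS = {
    "population": lambda c: c.population,
    "pageviews": pageviews,
    "pageviews per person": pageviews_per_person,
    "internet access": lambda c: c.internet_access,
}


def sort_countries(countries, by):
    """Sort countries from highest to lowest on the chosen criterion."""
    return sorted(countries, key=SORT_KEYS[by], reverse=True)


# Toy data, invented for illustration only.
toy = [
    Country("A", 1_000_000, 0.80, {"en": 500_000, "de": 150_000}),
    Country("B", 5_000_000, 0.10, {"en": 120_000, "fr": 40_000}),
]
```

With this toy data, country B has the larger population but far fewer pageviews per person, so the two sort orders disagree. That kind of contrast is what the sort options in the visualization make visible.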

Jen Lowe presenting her visualization at WikiSym

Missing Values

I think that visualization is amazing for its ability to force us to see what’s missing; to see the missing values in a collection of data. Anyone who has experience with data analysis, especially with analyzing other people’s data, knows the feeling of being totally preoccupied with missing values: how are they represented in the dataset? How should we deal with them – bootstrap to fill them in, or throw out the associated data completely? I find that visualization trains my mind to notice what’s missing.
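The two options named above, filling in missing values or throwing out the affected records, can be contrasted in a short Python sketch. It is not the method used for the visualization. The fill-in strategy here is a simple random resampling of observed values, standing in loosely for the "bootstrap" idea, and the function names, field names, and data are made up for illustration.

```python
import random


def count_missing(rows, field):
    """Count records whose field is missing (represented here as None)."""
    return sum(1 for r in rows if r[field] is None)


def drop_missing(rows, field):
    """Throw out every record whose field is missing."""
    return [r for r in rows if r[field] is not None]


def fill_by_resampling(rows, field, seed=0):
    """Fill each missing value with a random draw from the observed values.

    Returns new records; the input rows are left unchanged.
    """
    rng = random.Random(seed)
    observed = [r[field] for r in rows if r[field] is not None]
    if not observed:
        raise ValueError("no observed values to resample from")
    return [
        {**r, field: r[field] if r[field] is not None else rng.choice(observed)}
        for r in rows
    ]


# Toy data, invented for illustration only.
rows = [
    {"country": "A", "internet_access": 0.80},
    {"country": "B", "internet_access": None},
    {"country": "C", "internet_access": 0.10},
]

kept = drop_missing(rows, "internet_access")
filled = fill_by_resampling(rows, "internet_access")
```

Dropping shrinks the dataset and makes country B disappear entirely, while filling keeps B but assigns it a value it never reported. Either way, the original gap is no longer visible unless it is counted first.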

Missed Connections in the Global South

When I sort by region, I can force you to see the emptiness, the missed connections in the global south. The more I do visualization work, the more I notice who’s missing, not just globally, but personally.

Conclusions

There are people in the empty spaces of the visualization who want to be Wikipedia editors, who want to contribute, but don’t know it exists, or don’t see a way in. Openness is easy – you can just attach a license and say something is open. Accessibility is hard – it requires someone to take responsibility, to commit sustained effort. So – the goal I propose is: we meet back in 10 years and see the circle FILLED. No more missing values, no more missed connections, no more empty spaces. With the quantity of Wikipedia data being collected, we will be able to see, rather than speculate on, exactly how a diversity of voices has changed patterns of edits, the content, and the connections of Wikipedia. We will all have a Wikipedia for everyone, that reflects the collaborative contributions of everyone.

Quotes from the jury

Erik Zachte, data analyst for WMF, says:

I find this visualization extremely elegant, even mesmerizing. It is a joy to play with the different options, and to watch how the screen responds. Part of its appeal is its complexity: It resonates with how many people see Wikipedia – colossal and manifold, it is not so easy to grasp its inner workings. Coupled with the orderly presentation, this complexity invites the user to dive in, and perhaps be the first to find some new treasure, some hidden pattern.

Moritz Stefaner, information visualizer, commented:

The visualization is very rich in data and navigation modes. I much applaud the audacity to include this much data, navigation modes, and detail information; this has certainly been a great effort. The amount and density of the data is staggering.

“A Thousand Fibers Connect Us” is released under a Creative Commons BY-SA license and the underlying code will be published under an open license shortly.

Tilman Bayer, WMF Movement Communications

Dario Taraborelli, WMF Senior Research Analyst and WikiViz co-chair

5 Responses to ““A Thousand Fibers Connect Us” – WikiViz 2011 winner visualizes Wikipedia’s global reach”

  1. dario says:

    There are no runners-up as this was the only work among the submissions that the jury considered eligible for the final shortlist.

  2. Michael says:

    Awesome and a worthy winner! I read something about two runners-up. Is their work documented somewhere?

  3. Tilman says:

    Slides and a transcript from the WikiSym talk are now available on Lowe’s blog: “Visualizing Emptiness: Reflections on a Preoccupation with Missing Values”

  4. neitway says:

    “There are people in the empty spaces of the visualization who want to be Wikipedia editors, who want to contribute, but don’t know it exists, or don’t see a way in.” I agree with this.

  5. Kellerkind says:

    “There are people in the empty spaces of the visualization who want to be Wikipedia editors, who want to contribute, but don’t know it exists, or don’t see a way in.”

    I’m a bit confused about this. How do we know that? Did they say anything? (Could be) Maybe they don’t want an encyclopedia at all. Maybe they want their own newspaper. Or anything else to spread knowledge. IMHO it’ll be more polite to ask them about their needs.

    Kind regards.
