Sharing in the sum of all human knowledge requires us to go to the sources. Beyond citations to books, journals, and websites, knowledge comes alive through images, video, and audio footage. We can travel to the beginnings of human history and admire the beauty of the Venus of Brassempouy carved from mammoth ivory 25,000 years ago. We can marvel at 2,000-year-old mummy portraits that capture the dead in vivid colors. We can immerse ourselves in an Easter procession of the 19th century painted in incredible realism by Ilya Repin. We can listen to the earliest sound recording of a human voice, which was successfully played back for the first time only two years ago.

Galleries, libraries, archives, and museums (a collective we refer to as “GLAM”) document, showcase, preserve and protect our cultural treasures. The Internet gives us the opportunity to share digital entry points to the fuller experience that cultural institutions can offer. With more than 340 million unique visitors every month, Wikipedia is the central entry point for research in the Internet-connected world.

The international Wikimedia volunteer movement is therefore naturally aligned with the public service mission of cultural institutions. Over the last year, we have seen an acceleration of partnerships to bring content online. This is also a result of the emergence of Wikimedia’s world-wide presence through chapter organizations founded by volunteers, which exist in 27 countries.

For the first time, we now have compelling data that shows the success of these partnerships, and the virtuous circle they can inspire. We can also use the same metrics to track the success of Wikimedia’s other content outreach initiatives.

Measuring success

Developing improved content usage metrics was one of the key priorities identified at the Multimedia Usability Meeting in Paris (see previous report). Thanks to the work done by Bryan Tong Minh, who attended the meeting, the usage of every media file in our media repository is now fully tracked across different Wikimedia projects and languages. Based on this, Magnus Manske, another volunteer and Paris attendee, developed two useful scripts that help us track the usage of entire collections of content:

  • “Glamorous”, which enumerates where media from a collection are used (e.g. which Wikipedia languages);
  • “Amalglamate”, which tracks comparative collection usage data over time (starting January 12).

Using these scripts, we can analyze the impact of our content partnerships in real-time. For example:

In December 2008, Wikimedia Germany developed a partnership with the German Federal Archives resulting in the donation of about 80,000 images, most of which relate to German history. As required by Wikimedia policy, these images were donated under a free content license which allows anyone to re-use them, provided proper credit is given.

Of the 82,458 images uploaded, 18.3%, or 15,109 images, are in active use in Wikimedia’s projects (e.g. Wikipedia, Wikinews, Wikibooks).
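The share quoted above is a simple ratio of files in use to files uploaded. As a minimal sketch (the function name `usage_share` is our own illustration, not part of the Glamorous or Amalglamate scripts), the same figure can be recomputed for any collection:

```python
def usage_share(in_use: int, uploaded: int) -> float:
    """Return the percentage of uploaded files that are in active use."""
    if uploaded <= 0:
        raise ValueError("uploaded must be positive")
    return 100.0 * in_use / uploaded


# Figures quoted above for the German Federal Archives collection.
share = usage_share(15_109, 82_458)
print(f"{share:.1f}% of the collection is in use")
```

Tracking this ratio at regular intervals is one way to compare how quickly different collections are being adopted.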


B 145 Bild-F057884-0009

The most frequently used [1] photograph from the collection is the photograph of Willy Brandt, German Chancellor from 1969 to 1974. It is used in 60 language editions of Wikipedia, with a total of 83 uses.

Effectively, this photograph of Willy Brandt becomes an iconic image that web users from around the world will see when researching the politician, in any of these languages: Aragonese, Arabic, Azeri, Belarusian, Bulgarian, Breton, Bosnian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Fiji Hindi, Finnish, French, Galician, Georgian, German, Greek, Hebrew, Hungarian, Indonesian, Icelandic, Ido, Italian, Japanese, Korean, Kurdish, Latin (!), Lithuanian, Low Saxon, Lower Sorbian, Macedonian, Norwegian, Occitan, Persian, Polish, Portuguese, Quechua, Romanian, Russian, Serbian, Serbo-Croatian, Slovak, Swahili, Swedish, Tagalog, Tajik, Turkish, Ukrainian, Vietnamese, and Welsh. And it’s just one of more than 15,000 images from the collection that are already in active use, about a year after first being made available.
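Counts such as "60 language editions, 83 uses" come from grouping per-file usage records by wiki. The sketch below is an illustration only, not the code behind Glamorous: it assumes the `prop=globalusage` query module of the MediaWiki API on Wikimedia Commons and a response in which each usage record carries a `wiki` field. The sample response is made-up data with a placeholder file title, and no network request is performed.

```python
from collections import Counter
from urllib.parse import urlencode

# Assumed endpoint: the public MediaWiki API of Wikimedia Commons.
API = "https://commons.wikimedia.org/w/api.php"


def globalusage_url(file_title: str, limit: int = 500) -> str:
    """Build a query URL asking where a Commons file is used."""
    params = {
        "action": "query",
        "prop": "globalusage",
        "titles": file_title,
        "gulimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def summarize_usage(response: dict) -> Counter:
    """Count uses per wiki in a (decoded) globalusage response."""
    per_wiki = Counter()
    for page in response.get("query", {}).get("pages", {}).values():
        for use in page.get("globalusage", []):
            per_wiki[use["wiki"]] += 1
    return per_wiki


# Hypothetical sample response; titles and wikis are illustrative only.
sample = {
    "query": {
        "pages": {
            "1": {
                "title": "File:Example.jpg",
                "globalusage": [
                    {"title": "Willy Brandt", "wiki": "de.wikipedia.org"},
                    {"title": "Ostpolitik", "wiki": "de.wikipedia.org"},
                    {"title": "Willy Brandt", "wiki": "en.wikipedia.org"},
                ],
            }
        }
    }
}

usage = summarize_usage(sample)
print(f"{len(usage)} wikis, {sum(usage.values())} uses")
```

The number of distinct keys corresponds to the number of projects using the file, and the sum of the counts to the total number of uses.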

These tools do not yet show the number of pageviews of the articles in question, although that data is available. For example, the German Wikipedia article about Willy Brandt was viewed 38,449 times in December 2009. Summed across all the language editions and articles in which these images appear, the aggregate impact is large.

Like all media files in Wikimedia Commons, the image is available under a free content license, the Creative Commons Attribution/Share-Alike License. This means that it is usable by third parties as well, provided that proper credit is given. Tracking third party usage is, of course, more difficult. The MediaWiki software powering Wikimedia projects has built-in support for Wikimedia Commons (called “InstantCommons”), meaning that any wiki, anywhere, can immediately use files uploaded to Wikimedia Commons if this feature is enabled. For example, you can view the Willy Brandt image on WikiEducator (not a Wikimedia project), with all the same metadata, even though it has never been uploaded there. In the future, we may be able to track image usage across third party MediaWiki installations as well.

The Virtuous Circle

Not only do these images enrich articles in many languages, they also make it easier for contributors to get started in languages that don’t yet have an article. And, importantly, they drive awareness of the cultural institutions that provided them, as each and every image carries a visible seal when clicked:

[Image: Barchseal.png]

Note how even the seal itself has been translated into 23 languages already. The images carry the original metadata provided by the Bundesarchiv:

[Image: barch-metadata.png]

This links back to a copy hosted on the archive’s servers. Because the descriptions and other data in the records of the German Federal Archives sometimes contain errors, there’s a dedicated page that lets volunteers submit corrections. This page is regularly reviewed by the archive’s employees, and corrections are incorporated into its records.

The usage of the images therefore drives interest in the content, awareness of the institutions, and improvements to the metadata, and hopefully encourages other institutions to follow. Since the German Federal Archives donation, several large content partnerships have been established:

  • The donation of 250,000 historic images by the German “Fotothek”
  • The donation of 39,000 images about Suriname and Indonesia by the Dutch Tropenmuseum, with more to follow

Beyond partnering with cultural institutions, Wikimedia chapters have also taken a leadership role in documenting the world around us through picture competitions, expeditions, and workshops. The aforementioned metrics can be used to track which models produce content that ends up being widely used in Wikimedia’s projects. Examples include:

The usage of images from these and other initiatives will now be tracked over time. Of course, having such metrics is only the beginning, and WMF will invest in global program support capacity to ensure that we learn from, document, and incentivize best practices.

Managing growth

Altogether, Wikimedia Commons has achieved extraordinary growth over the past year. Launched in September 2004, it took two years for the multimedia repository to reach the milestone of one million files. We’re now at almost six million files, two million of which were added in the last 12 months. More content partnerships, new video functionality, and improved usability (see earlier post) will further accelerate this growth.

Thanks to Wikimedia’s large network of supporters, we can keep up with this growth. It has been a much closer call this time than we would like, as the chart below of our recently shrinking free media storage space (out of a total of 8 terabytes) illustrates:

[Image: freespace.png]

But yesterday, we put into service a new media storage server which more than triples our total storage capacity (it will be redundantly mirrored to a second server with the same capacity). This, too, is likely only the beginning. Wikimedia Commons is not comparable to websites like Flickr or Picasa: it does not aim to document vacations, parties, and precious life moments. It is a repository of educational media. But there’s a world full of riches still waiting to be brought closer to the minds of millions.

Erik Moeller
Deputy Director, Wikimedia Foundation

[1] excluding the use of images for purposes of navigation and topical representation on a large number of articles

Contact a local Wikimedia chapter

Upcoming events:

  • On April 13, 2010, Wikimedia volunteers and Wikimedia Foundation representatives will participate in a one-day workshop as part of the “Museums and the Web 2010” conference (“Wikimedia@MW2010”) to further explore and promote the active engagement between the communities.
  • On January 31, 2010, Wikimedia UK will kick off Britain Loves Wikipedia, a month-long photo competition that invites the general public to take photos of cultural treasures in participating institutions, for the primary purpose of illustrating Wikipedia articles.