Wikimedia blog

News from inside the Wikimedia Foundation

Archive for January, 2010

Britain Loves Wikipedia competition starts 31 January 2010

Starting 31 January and during the entire month of February 2010, participating museums in Great Britain are joining with people of all ages, backgrounds and communities to celebrate Britain Loves Wikipedia. The public is encouraged to photograph the multitude of national treasures contained in Britain’s collections, releasing them under a free license to be used to illustrate Wikipedia articles and much more.

The initiative is being spearheaded by the volunteer chapter based in the United Kingdom, Wikimedia UK. Wikimedia’s volunteer chapters (which now number 27 and continue to grow) support the movement by carrying out fundraising, public outreach, and relationship building in their respective territories.

You can read more about Britain Loves Wikipedia on the Wikimedia UK blog here. If you’re in the UK through the coming month, join up and help grow Wikimedia’s collection of freely reusable images and media!

Cary Bass, Volunteer Coordinator

Danese Cooper, our new CTO

I’m delighted to announce that the Wikimedia Foundation has hired a new Chief Technical Officer, Danese Cooper. Danese is an experienced technology manager and open-source evangelist. Danese will start with Wikimedia on February 4, 2010.  You can read more about this great news in the Foundation press release that went out today.

Danese has a wealth of experience in open source technology. Most recently, she developed open source strategy for the tech start-up REvolution Computing. Prior to that, she was Senior Director of Open Source Strategies at Intel from 2005 until 2009, and Chief Open Source Evangelist at Sun Microsystems from 1999 to 2005. In those roles, she led or supported major open source initiatives, including Sun’s OpenOffice.org application suite, the Java platform, JXTA, NetBeans, GridEngine, OpenSolaris and Intel’s Channel Software Operations and Moblin platform initiatives. Prior to working at Sun, she managed technology teams at Symantec and at Apple Computing for a total of nine years.

Danese is a Board member at the Open Source Initiative, the non-profit organization that maintains the Open Source Definition and approves open source software licenses. She is also a member of the Apache Software Foundation, and serves on a Special Advisory Board for Mozilla.

As CTO, Danese will be responsible for ensuring Wikipedia and the other Wikimedia projects run reliably and perform well from a technical standpoint. She will also be responsible for supporting and developing Wikimedia’s open source software stack including MediaWiki, and for creating technical strategy and technical projects to support increases in Wikimedia projects’ reach, quality and participation. Her background as an evangelist will be particularly important, because the health of the Wikimedia volunteer developer community is critical to Wikimedia’s ability to successfully serve people in multiple geographies and languages.

We’d also like to thank the Walker Talent Group for its pro bono work helping recruit Danese, as well as Advisory Board member Roger McNamee for introducing Wikimedia to Walker. Their help is much appreciated.

Sue Gardner,
Executive Director, Wikimedia Foundation

Enriching Wikimedia Commons: A Virtuous Circle

Sharing in the sum of all human knowledge requires us to go to the sources. Beyond citations to books, journals, and websites, knowledge comes alive through images, video, and audio footage. We can travel to the beginnings of human history and admire the beauty of the Venus of Brassempouy carved from mammoth ivory 25,000 years ago. We can marvel at 2000-year-old mummy portraits that capture the dead in vivid colors. We can immerse ourselves in an Easter procession of the 19th century painted in incredible realism by Ilya Repin. We can listen to the earliest sound recording of a human voice, which was successfully played back for the first time only two years ago.

Galleries, libraries, archives, and museums (a collective we refer to as “GLAM”) document, showcase, preserve and protect our cultural treasures. The Internet gives us the opportunity to share digital entry points to the fuller experience that cultural institutions can offer. With more than 340 million unique visitors every month, Wikipedia is the central entry point for research in the Internet-connected world.

The international Wikimedia volunteer movement is therefore naturally aligned with the public service mission of cultural institutions. Over the last year, we have seen an acceleration of partnerships to bring content online. This is also a result of the emergence of Wikimedia’s world-wide presence through chapter organizations founded by volunteers, which exist in 27 countries.

For the first time, we now have compelling data that shows the success of these partnerships, and the virtuous circle they can inspire. We also can use the same metrics to track the success of Wikimedia’s other content outreach initiatives.

Measuring success

Developing improved content usage metrics was one of the key priorities identified at the Multimedia Usability Meeting in Paris (see previous report). Thanks to the work done by Bryan Tong Minh, who attended the meeting, the usage of every media file in our media repository is now fully tracked across different Wikimedia projects and languages. Based on this, Magnus Manske, another volunteer and Paris attendee, developed two useful scripts that help us track the usage of entire collections of content:

  • “Glamorous”, which enumerates where media from a collection are used (e.g. which Wikipedia languages);
  • “Amalglamate”, which tracks comparative collection usage data over time (starting January 12).

Using these scripts, we can analyze the impact of our content partnerships in real-time. For example:

In December 2008, Wikimedia Germany developed a partnership with the German Federal Archives resulting in the donation of 80,000 images, most of which relate to German history. As required by Wikimedia policy, these images were donated under a free content license which allows anyone to re-use them, provided proper credit is given.

Of the 82,458 images uploaded, 18.3%, or 15,109 images, are in active use in Wikimedia’s projects (e.g. Wikipedia, Wikinews, Wikibooks).
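The usage figures quoted here come down to simple counting, and a minimal sketch of that arithmetic follows. It is not the code behind Glamorous or Amalglamate; the function names, the `(wiki, page)` record shape, and the sample records are assumptions made for illustration. Only the two image counts are taken from this post.

```python
from collections import Counter


def usage_share(files_in_use: int, files_uploaded: int) -> float:
    """Percentage of uploaded files in active use, rounded to one decimal."""
    if files_uploaded <= 0:
        raise ValueError("files_uploaded must be positive")
    return round(100 * files_in_use / files_uploaded, 1)


def summarize_usage(usages):
    """Return (number of distinct wikis, total number of uses) for one file.

    `usages` is a list of (wiki, page) pairs; this record shape is an
    assumption for the sketch, not the real tracking schema.
    """
    per_wiki = Counter(wiki for wiki, _page in usages)
    return len(per_wiki), sum(per_wiki.values())


# Figures from the post: 15,109 of 82,458 uploaded images are in use.
print(usage_share(15_109, 82_458))  # 18.3

# Hypothetical records: one file used on two wikis, three pages in total.
sample = [
    ("de.wikipedia", "Page A"),
    ("de.wikipedia", "Page B"),
    ("en.wikipedia", "Page A"),
]
print(summarize_usage(sample))  # (2, 3)
```

Keeping "distinct wikis" and "total uses" as separate numbers matters because a single file can appear on several pages of the same wiki.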

The most frequently used [1] photograph from the collection is the photograph of Willy Brandt, German Chancellor from 1969 to 1974. It is used in 60 language editions of Wikipedia, with a total of 83 uses.

Effectively, this photograph of Willy Brandt becomes an iconic image that web users from around the world will see when researching the politician, in any of these languages: Aragonese, Arabic, Azeri, Belarusian, Bulgarian, Breton, Bosnian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Fiji Hindi, Finnish, French, Galician, Georgian, German, Greek, Hebrew, Hungarian, Indonesian, Icelandic, Ido, Italian, Japanese, Korean, Kurdish, Latin (!), Lithuanian, Low Saxon, Lower Sorbian, Macedonian, Norwegian, Occitan, Persian, Polish, Portuguese, Quechua, Romanian, Russian, Serbian, Serbo-Croatian, Slovak, Swahili, Swedish, Tagalog, Tajik, Turkish, Ukrainian, Vietnamese, and Welsh. And it’s just one of more than 15,000 images from the collection that are already in active use, about a year after first being made available.

These tools do not yet show the number of pageviews of the articles in question, although that data is available. For example, the German Wikipedia article about Willy Brandt was viewed 38,449 times in December 2009. Considering the combined language usage of Wikipedia, the use of images in many articles creates a large aggregate impact.

Like all media files in Wikimedia Commons, the image is available under a free content license, the Creative Commons Attribution/Share-Alike License. This means that it is usable by third parties as well, provided that proper credit is given. Tracking third party usage is, of course, more difficult. The MediaWiki software powering Wikimedia projects has built-in support for Wikimedia Commons (called “InstantCommons”), meaning that any wiki, anywhere, can immediately use files uploaded to Wikimedia Commons if this feature is enabled. For example, you can view the Willy Brandt image on WikiEducator (not a Wikimedia project), with all the same metadata, even though it has never been uploaded there. In the future, we may be able to track image usage across third party MediaWiki installations as well.

The Virtuous Circle

Not only do these images enrich articles in many languages, they also make it easier for people in languages that don’t have an article to get started. And, importantly, they drive awareness of the cultural institutions that provided them — as each and every image carries a visible seal when clicked:

Note how even the seal itself has been translated into 23 languages already. The images carry the original metadata provided by the Bundesarchiv:

This links back to a copy hosted on the archive’s servers. Because the descriptions and other data in the records of the German Federal Archives sometimes contain errors, there’s a dedicated page that lets volunteers submit corrections. This page is regularly reviewed by the archive’s employees, and corrections are incorporated into its records.

The usage of the images therefore drives interest in the content, awareness of the institutions, improvements of the metadata — and hopefully incentivizes other institutions to follow. Since the German Federal Archives, several large content partnerships have been established:

  • The donation of 250,000 historic images by the German “Fotothek” (more info)
  • The donation of 39,000 images about Suriname and Indonesia by the Dutch Tropenmuseum (more info), with more to follow

Beyond partnering with cultural institutions, Wikimedia chapters have also taken a leadership role in documenting the world around us through picture competitions, expeditions, and workshops. The aforementioned metrics can be used to track which models produce content that ends up being widely used in Wikimedia’s projects. Examples include:

The usage of images from these and other initiatives will now be tracked over time. Of course, having such metrics is only the beginning, and WMF will invest in global program support capacity to ensure that we learn from, document, and incentivize best practices.

Managing growth

Altogether, Wikimedia Commons has achieved extraordinary growth over the past year. Launched in September 2004, it took two years for the multimedia repository to reach the milestone of one million files. We’re now at almost six million files, two million of which were added in the last 12 months.  More content partnerships, new video functionality, and improved usability (see earlier post) will further accelerate this growth.

Thanks to Wikimedia’s large network of supporters, we can keep up with this growth. It’s been a much closer call this time than we would like, as the chart below showing our recently shrinking media storage capacity illustrates (out of a total of 8 terabytes):

But yesterday, we put into service a new media storage server which more than triples our total storage capacity (it will be redundantly mirrored to a second server with the same capacity). This, too, is likely only the beginning. Wikimedia Commons is not comparable to websites like Flickr or Picasa: it does not aim to document vacations, parties, and precious life moments. It is a repository of educational media. But there’s a world full of riches still waiting to be brought closer to the minds of millions.

Erik Moeller
Deputy Director, Wikimedia Foundation

[1] excluding the use of images for purposes of navigation and topical representation on a large number of articles

Contact a local Wikimedia chapter

Further reading:

Upcoming events:

  • On April 13, 2010, Wikimedia volunteers and Wikimedia Foundation representatives will participate in a one-day workshop as part of the “Museums and the Web 2010” conference (“Wikimedia@MW2010”) to further explore and promote the active engagement between the communities.
  • On January 31, 2010, Wikimedia UK will kick off Britain Loves Wikipedia, a month-long photo competition that invites the general public to take photos of cultural treasures in participating institutions, for the primary purpose of illustrating Wikipedia articles.

Deployment of Babaco Enhancements

The Usability Team is preparing a supplemental release that will bring more stability and functionality to the features of its Babaco release, which has been available to logged-in users of Wikimedia sites since October 2009. The changes include many improvements in interactivity and aesthetics, but the most critical one is the use of an HTML iframe element together with a special design mode that modern browsers support, in place of the previous HTML textarea. This paves the way for a rich editing experience with collapsible templates and syntax highlighting, and provides a foundation on which a WYSIWYG editor may eventually be built.

The table of contents, which now features controls for expanding, collapsing and resizing, is also much more accurate than before. The link dialog, which once had a tabbed interface for making internal or external links, now intelligently detects the type of link you are making, an improvement we designed and prototyped while literally watching users struggle with the software during usability testing. Finally, there is now support for language-specific icons in the toolbar, with several languages ready to go, such as German, French, Spanish, Dutch, Portuguese, and Polish. This feature allows us to provide a more native experience by using language-specific mnemonics. So far we’ve applied this feature to the bold and italic buttons, which are now B and I for English, F and K for German, and G and C for Spanish. Languages without customized icons will continue using a bold A and an italic A.
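The language-specific icon behavior described above amounts to a lookup with a default. Here is a minimal sketch of that idea; it is not the toolbar's actual code. The language codes, the `TOOLBAR_LETTERS` table, and the `toolbar_letters` helper are all hypothetical, and only the letter pairs stated in this post are filled in.

```python
# (bold, italic) mnemonic letters per language, as stated in the post.
# Keys are assumed ISO-style language codes; other languages that have
# customized icons are deliberately left out because the post does not
# give their letters.
TOOLBAR_LETTERS = {
    "en": ("B", "I"),
    "de": ("F", "K"),
    "es": ("G", "C"),
}

# Fallback pair for languages without customized icons.
DEFAULT_LETTERS = ("A", "A")


def toolbar_letters(lang_code: str) -> tuple:
    """Return the (bold, italic) letters for a language, or the default pair."""
    # Treat regional variants such as "de-AT" like their base language.
    base = lang_code.lower().split("-")[0]
    return TOOLBAR_LETTERS.get(base, DEFAULT_LETTERS)


print(toolbar_letters("de"))     # ('F', 'K')
print(toolbar_letters("de-AT"))  # ('F', 'K')
print(toolbar_letters("ja"))     # ('A', 'A')
```

A plain dictionary with a default keeps the fallback explicit: adding a language means adding one entry, and everything else keeps working unchanged.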

The team will be deploying these upgrades in the coming days. To learn more about the Wikipedia Usability Initiative, check out their website at http://usability.wikimedia.org.

Multimedia Usability Project Underway

Some new faces have joined the Foundation’s multimedia usability project, and important developments are underway to improve uploading and sharing of multimedia materials on Wikimedia’s projects.

We are excited that Guillaume Paumier, Product Manager of the Ford multimedia usability project, has moved from Toulouse, France and joined the Wikimedia usability team at our San Francisco office. We are also excited that Neil Kandalgaonkar has joined the multimedia usability project as a software developer. Neil brings a rich technical background from major social networking sites such as Flickr and Upcoming.org.

The multimedia usability project will focus on the following three areas:

  • Simplify and streamline the media uploading process to Wikimedia Commons
  • Create a staging area where incomplete work can be reviewed and amended
  • Integrate interwiki uploading so that uploads from Wikimedia projects are directed to Commons (the binary repository for all Wikimedia projects) seamlessly, and support moving of existing files from Wikimedia projects to Commons

These focus areas were determined based on the discussion at the multimedia usability meeting in Paris with active Wikimedia Commons contributors and on the objective of the Ford grant to increase participation in Wikimedia Commons.

We have a lot of ideas and features we would like to improve, but our resources are limited; we need to prioritize and focus on a few core changes. To improve the usability of Wikimedia Commons, collaboration with the volunteer contributor community and partnership with our global chapters will be vital. The multimedia usability meeting in Paris, sponsored by Wikimedia France, was immensely valuable in laying the groundwork for this project. Wikimedia Deutschland is leading an initiative to develop multilingual search so that rich internationalized content can be retrieved by a global audience.

Guillaume has been actively publishing his initial research work, survey results, user interviews and domain research to the multimedia hub of the usability wiki. He is also working on the initial mock-ups of a simplified user workflow for uploading media files to Commons. Have a look and post your feedback on the discussion page.

We are also working with Michael Dale to integrate an Add-Media-Wizard into the enhanced Wikimedia project toolbar, which is currently offered as part of the usability beta. Add-Media-Wizard allows users to search for relevant media files from Wikipedia articles and insert them into the article without leaving the editing window. Michael’s work is already available as a gadget, but the plan is to offer it to a wider audience by integrating it into user preferences. For a sneak peek of this feature, you can visit the usability sandbox. Just click the image icon in the toolbar, and you can experience the intuitive way of including media assets. Please be aware that the sandbox is an experimental area, so the condition of the software changes constantly.

If you’re online and have access to IRC you can join the multimedia usability team for ‘office hours,’ where we’ll be available live to take questions and discuss ongoing work around the usability project. The next office hours take place Thursday, February 4 at 9AM to 10AM PDT (16:00 to 17:00 UTC). Visit the IRC Office Hours planning page for more info and for assistance in joining the conversation.

More usability improvements are coming! Stay tuned.

Naoko Komura
Program Manager
Wikimedia Usability Initiative

Second annual report is now available

We’re very pleased to announce the release of the Foundation’s second annual report, covering the 2008/2009 fiscal year.  The Foundation’s annual report covers a full year of activity, highlighting our fiscal operations, programs and outreach successes, major milestones, and of course the work of thousands of volunteers and chapters around the world.

This year you’ll find a new annual time-line that showcases major events through the year, and also a center spread featuring details and facts about an incredible article created during the previous fiscal year.

As always, the images and information in the report are all under the Creative Commons CC BY-SA 3.0 license, including images from many Wikimedia volunteers with principal photography by Lane Hartwell, a San Francisco-based photographer. This year’s report was designed by Exbrook, a design strategy firm based in San Francisco led by Rhonda Rubenstein and David Peters.

Thanks to all who supported the production – looking forward to hearing your comments and suggestions.

Jay Walsh, Communications

Bugzilla upgrade.

Thanks to Priyanka’s wonderful work in theming Bugzilla and ironing out the last couple of bugs and extensions, we are finally ready to move on with the upgrade. As a side effect, Bugzilla will be down for a couple of hours (let’s say 2 to be on the safe side) around lunch-time.  (Edit Addition: 2010-01-19 between 20:00 GMT and 22:00 GMT)

Another Year Wiser

On this day in 2001, Wikipedia, a small, experimental project with a big mission, was introduced to the world by founder Jimmy Wales. Nine years later, having grown at unprecedented speed thanks to the dedicated and enthusiastic support of volunteers and contributors, Wikipedia has evolved into one of the most important sources of free information and knowledge in the world. Happy birthday, Wikipedia!

In 2002, after just one year of existence, Wikipedia had grown to over 20,000 articles, and in its second year, to 130,000 articles in 28 languages. To wax nostalgic, you can turn back the hands of time and rummage through “vintage” Wikipedia here. Since then, over 14 million articles in 270 languages have been created. For millions of people everywhere, Wikipedia has become an indispensable part of their daily lives: a resource relied on by hundreds of millions of visitors a month, and growing.

In celebration of this milestone, Wikipedians in Bangalore, India and New York City have planned events.  If you’re celebrating with us, post a comment and tell us how, or even better, add photos to Wikimedia Commons.

Thanks, Wikipedia! Here’s to another year wiser!

Moka Pantages, Communications

Hebrew Wikipedia Breaks 100,000

Clocking in at 52 million words today, Hebrew Wikipedia announced they’ve reached 100,000 articles.  At 16:54 UTC, user: Brookli submitted an article about Seaton Delaval Hall, an 18th century English country home. The largest encyclopedia written in the Hebrew language got its start six and a half years ago, July 9, 2003, with an article on Mathematics by user: Rotem Dan.

Hebrew Wikipedia is among the top 30 language Wikipedias of 270 languages worldwide.  To commemorate the event, on Friday, January 15th, the community will hold a meeting in Israel at Tel Aviv University where 100 Wikipedia contributors will discuss the state of Wikipedia today and in the future. This event will be streamed live beginning at 07:30 UTC on Friday:  http://www.livestream.com/wikipediaisrael.

Congratulations to all of those who helped Hebrew Wikipedia reach such an important milestone.

Mazal Tov!

Moka Pantages, Communications


Flagged Revisions: Your questions answered!

There has been a lot of interest lately in Flagged Revisions, a quality control mechanism for MediaWiki. In particular, people want to know when and how it will be used on the English language Wikipedia.

I’m William Pietri, a San Francisco software consultant who recently came on part time to do project management for this. In addition to my long experience building web software for on-line communities, I’m also a Wikipedian. Although I haven’t done much more than small corrections lately, I started editing in 2004, and became an admin in 2007.

My job on this has two main parts. The first is to make sure that everybody working on this gets everything they need to make progress. The second is to communicate progress to the wider world. In the spirit of open communication, this is an update in question-and-answer form. Most of these are real questions people have actually asked me, but I’m going to sneak in a couple that I expect somebody will ask shortly.

What is Flagged Revisions?

You can find more detail here, but Flagged Revisions is basically a way to insert a quality review step between someone editing an article and that article version being published for the general public to see. It has been in use on the German Wikipedia since May 2008, and implemented in other languages and projects since then (see this page on Meta for a full list). Typically, in those use cases, every single article is treated in this way, and every change by a new user has to be reviewed.  There are a number of ways Flagged Revs can be used, and the proposed implementation for English Wikipedia (described below) is quite different.

Fundamentally, the objective of this technology is to reduce the exposure of readers both to subtle and not-so-subtle malicious changes in articles (whether it’s the insertion of blatant nonsense, or claiming the death of a celebrity), and to reduce the workload of people patrolling these changes by reducing duplication of effort.

What about the English language Wikipedia?

The use of Flagged Revisions on the English Wikipedia has been under discussion for a long time. Ultimately, the English Wikipedia volunteer community developed a proposal titled “Flagged protection and patrolled revisions”, which garnered strong support. It is fundamentally different from the way the technology has been used so far. Instead of requiring every change by a new or untrusted user to be reviewed, the mechanism would be activated on a per-page basis only, as an alternative to existing tools to restrict editing.

Notably, thousands of articles in the English Wikipedia, typically pages with a very high risk of malicious editing (e.g. major political figures), are currently “semi-protected”, meaning that new or unregistered users cannot make any changes at all. This new tool would make it possible to open up these pages for editing, provided that potentially problematic changes receive positive review. As a result of the more open approach, more high-risk pages could be made subject to this level of community moderation.
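To make the difference between the two approaches concrete, here is a minimal sketch of the per-page decision as described above. It is not how the FlaggedRevisions extension is implemented. The `PageMode` and `Outcome` names and the single `editor_is_trusted` flag (standing in for "not new and not unregistered") are assumptions made for illustration.

```python
from enum import Enum


class PageMode(Enum):
    OPEN = "open"                    # no restriction on the page
    SEMI_PROTECTED = "semi"          # current tool: untrusted edits are blocked
    FLAGGED_PROTECTED = "flagged"    # proposed tool: untrusted edits await review


class Outcome(Enum):
    PUBLISHED = "visible to readers immediately"
    PENDING_REVIEW = "saved, shown to readers after positive review"
    BLOCKED = "edit not allowed"


def edit_outcome(mode: PageMode, editor_is_trusted: bool) -> Outcome:
    """Decide what happens to an edit under the simplified model above."""
    if editor_is_trusted or mode is PageMode.OPEN:
        return Outcome.PUBLISHED
    if mode is PageMode.SEMI_PROTECTED:
        return Outcome.BLOCKED
    # Flagged protection: the page stays editable, but the change waits.
    return Outcome.PENDING_REVIEW


for mode in PageMode:
    print(mode.name, "->", edit_outcome(mode, editor_is_trusted=False).name)
```

In this model the only case that changes is an untrusted editor on a restricted page: the edit is held for review instead of being refused outright.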

Initially, the English Wikipedia volunteer community wants to trial the system for two months. In addition to this alternative to page protection, the proposal calls for implementation of a new feature called “patrolled revisions”, which doesn’t impact what readers see, but is designed to make it easier for change patrollers to organize their work.

How is the Wikimedia Foundation responding to this proposal?

The Wikimedia Foundation (WMF), together with Wikimedia Germany, has driven and funded the development of the FlaggedRevisions technology since 2007 to the point where it has been able to scale to more than a year of production use in our second-largest project, the German language Wikipedia. WMF has carefully reviewed the English Wikipedia proposal, and allocated resources to assess its impact and support implementation along the principles outlined in the proposal.

The technology as proposed markedly differs from the way it’s been used before, so it’s a substantial development and design effort to get it right. For example, the notion of a per-page setting necessitates an entire set of user interface changes that allow changing the setting of a page, and that make it clear to a reader what the state of a particular page is. See this mock-up by Howie Fung as an example of what revised per-page controls could look like. WMF will post further mock-ups for feedback and prototypes for testing as they are built.

Who is currently working on the project?

  • Aaron Schulz, a contract developer with Wikimedia, is the lead developer of FlaggedRevisions.
  • Howie Fung, a contract product manager who also works with Wikimedia’s Usability Initiative, is supporting usability and product review of the software.
  • I, William Pietri, support the project management of the English Wikipedia roll-out as described above.
  • Erik Zachte, Wikimedia’s Data Analyst, will develop metrics specifically assessing the impact of the English Wikipedia rollout.

When will it be done?

This question has been asked a lot lately. Because this isn’t the flipping of a switch but a software development project, answering it requires me to let you in on a secret about software development projects. There are basically four ways to deal with dates, but only three of them are sane:

  1. It’s done when it’s done. Nobody mentions dates. The developers code until they’re finished. Then you release, get feedback, and code some more.
  2. Measure progress and project dates. You lay out all the work, estimate relative size, and then measure how much you get done over time. That data is used to figure out release dates.
  3. Pick a date and release whatever you finish. If you’re building, say, annual tax return software, it’s better to ship on time and drop features than it is to finish late with everything.
  4. Make up dates to please people. This is very popular, and has the advantage of making people happy at first, but it rarely works out well.

Until recently, we were using the first approach. That’s how most MediaWiki development (and most other open source development) works, and it has many advantages. But because a lot of people are eager for this project to launch, we’re shifting to the second approach.
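For readers curious what "measure progress and project dates" looks like in practice, here is a minimal sketch. It is not the calculation any particular tracking tool performs. The function name, the idea of estimating work in relative "points", and every number in the example are hypothetical.

```python
import math
from datetime import date, timedelta


def project_release(remaining_points, completed_per_week, today):
    """Project a release date from measured weekly progress.

    remaining_points: estimated relative size of the work still to do.
    completed_per_week: points actually finished in each past week.
    """
    if not completed_per_week:
        raise ValueError("need at least one week of measurements")
    velocity = sum(completed_per_week) / len(completed_per_week)
    if velocity <= 0:
        raise ValueError("no measurable progress yet; cannot project a date")
    # Round up: a partly used week still pushes the date out by a week.
    weeks_left = math.ceil(remaining_points / velocity)
    return today + timedelta(weeks=weeks_left)


# Hypothetical numbers: 40 points left, three measured weeks averaging 10.
print(project_release(40, [8, 12, 10], date(2010, 1, 15)))  # 2010-02-12
```

The projection is only as good as the measurements behind it, which is why a first guess needs a few weeks of data.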

The developer, Aaron Schulz, has estimated all of the items on the work list and already started in on them. The holidays complicate things some, but I expect we’ll have enough data to make a first guess at the estimated release date by the middle of January.

Wait, there’s only one developer on this? Is the Wikimedia Foundation taking this seriously?

Yes, absolutely. Aaron has been working on the Flagged Revisions extension for years, and nobody knows it better. We talked about adding developers, but unfortunately adding more people now wouldn’t help. I haven’t dug into the history much, but it looks like the real slowdown lately wasn’t in the coding; it was in turning the many-voiced community response into a clear set of things to do.

Having spent time with all the people involved, it’s clear to me that the Foundation takes this project very seriously. It’s one of a small number of high-priority projects, which include things like keeping the site running, organizing the annual fundraiser, and Wikimedia’s usability initiative.

How can I keep track?

There are a few ways. First, we’ll mention big updates (and the eventual release) here on this blog. Second, keep an eye on the labs site. That will be updated regularly with the latest code and configs. You can judge for yourself how we’re doing, and make sure we do it right. And third, I’ve put the work queue into a public web-based tool called Pivotal Tracker. It’s one of the few software project management tools made for the measure-and-project approach we’re using; if you’re the sort of person who likes way too much detail, you can find real-time updates there.

How can I get involved?

Go to the labs site, play with the current implementation, give feedback, post your own user interface design suggestions, report bugs, and so forth.  Further community discussion in the English Wikipedia about the proposed roll-out is happening on the page about flagged protection and patrolled revisions. I’m always glad to hear feedback, either on my talk page or via email. And of course, you can comment on this very blog post.

William Pietri
Contractor, Wikimedia Foundation