Wikimedia blog

News from inside the Wikimedia Foundation.org

Posts by Erik

October 2011 Coding Challenge winners

In October 2011, we tried a new experiment in attracting volunteer developers and advertising opportunities to get involved with Wikimedia’s open source codebase. The October 2011 Coding Challenge invited developers to submit projects in three categories:

  • Mobile Wikipedia: Uploading images and other media via your smartphone
  • Slideshows: Showcase Wikipedia’s beautiful multimedia
  • Realtime: Surface changes to Wikipedia’s content more dynamically

While we received lots of interesting submissions in all three categories, ultimately we had to pick three winners. The grand prize winners in each category get to attend a 2012 Wikimedia event of their choice at our expense. Two runners-up in each category are receiving a certificate of excellence acknowledging the high quality of their submission.


Announcing the October 2011 Coding Challenge

Great programmers drive any successful tech organization, and great programmers can be hard to find. Fortunately, the Wikimedia Foundation has a unique advantage: millions of unique visitors, every single day. It’s Wikipedia’s global impact which has enabled us to mobilize hundreds of thousands of donors every year to support our mission in our annual fundraising campaign.

We wanted to find out if we could find some of the world’s best programmers using similar means, to cultivate potential future volunteers and job candidates.  Thus, the idea of the Coding Challenge was born.

This is, admittedly, an experiment, and we’re not entirely sure how it will go.  We’ve structured it like a contest, and contests can be tricky, because they have rules.  One of the most important rules: only one contestant per challenge can win The Grand Prize (an all-expenses paid trip to an eligible Wikimedia event).   We run the risk of creating an ultra-competitive environment, in which people care more about winning than about helping to build a better Wikipedia.

Let’s reiterate, then: the goal of the contest is to find the best programmers — and at WMF, we get to decide exactly what that means.  (See the part in the rules about “sole discretion”.)  And a great programmer must write great code, of course — but the greatest programmers know that it means a lot more than that, too.  Is the documentation good?  Have ideas been exchanged on the wikitech-l mailing list?  Were participants generous with their time and ideas?  Do we see those ideas proliferating through the code of others?

There’s also a lot of value in this contest even to those who don’t “win”.  Maybe there’s only one grand prize per challenge, but WMF can, and will, bestow accolades to everyone who writes good code and shares it with everyone else.  And those things matter: when a potential employer comes to call, and you can point to the nice folks at Wikipedia to vouch for your work, that’s no small thing.

There are great potential programmers all over the globe.  If we can convince even a small fraction of you to accept one of these challenges, we will consider this little experiment a resounding success.

The Coding Challenge will run until November 7, 2011, 23:59 UTC.

Greg DeKoenigsberg, Coding Challenge Coordinator
Erik Moeller, VP of Engineering and Product Development

Expanded Use of Article Feedback Tool

Today on English Wikipedia we rolled out the Article Feedback Tool – previously featured on 3,000 English Wikipedia articles – to a larger set of 100,000 articles. This initial expansion is intended to further assess both the tool’s value and its performance characteristics, with an eye to a full deployment on Wikipedia and potentially other projects.

Some examples of articles that currently feature the tool (at bottom):

The intent of the tool is two-fold:

  • to gain aggregate quality assessments of Wikimedia content by readers and editors;
  • to use as an entry vector for other forms of engagement.

To assess its value in both categories, we’ve already undertaken a significant amount of qualitative and quantitative research. You can read an extensive summary of our work so far here.

The high level summary based on the data we’ve seen so far: We believe user ratings can be a valuable way to predict high and low quality content in Wikimedia, and we’re especially interested in engaging raters beyond the initial act of assessing an article. Through our trials to date, we’ve seen very good conversion rates on the calls-to-action that follow a rating, suggesting that this could be a powerful engagement tool as well.

Beyond continuing our own research and these engagement experiments, our goal is to regularly make available anonymized data from the tool, and to supply editors with a dashboard tool for surfacing trends in the rating data. We’re looking forward to sharing wider findings from the use of the tool soon.

Please use the talk page or comment below for feedback, questions and suggestions.

Erik Moeller
Deputy Director, Wikimedia Foundation

A new way to share pictures, sounds and video

UploadWizard uploading multiple files

On April 15, Wikimedia Commons celebrated its 10 millionth media file. A new feature will help to increase that number even faster. The upload wizard, which entered public beta in late November and has been used to upload more than 10,000 files already, is now the default upload tool on Wikimedia Commons. Use it and tell us what you think, as we continue to improve it.

Here’s what’s different:

  • Instead of overloading the user interface with information about licensing and acceptable content, there’s a single comic strip tutorial explaining the licensing policy, which can be dismissed after the first time you see it.
  • You only see complexity when you need to see it. There are sensible defaults for licensing, automatic metadata extraction from the uploaded files, etc.
  • You can upload up to 10 files as a batch, instead of having to upload each file individually. You can see thumbnails of the files you’re uploading, and abort any individual upload.
  • Error cases should be handled in a clear and understandable fashion, and guide the user towards the most sensible action (e.g. when a file needs to be renamed, the upload shouldn’t fail: instead, the tool will prompt that a rename is necessary).
  • As a final step, the UploadWizard explains how to add uploaded files to pages in Wikimedia projects.

And here’s what some of our experienced users have said during the beta:

  • “The upload wizard provides a much less cluttered and confusing upload process.”
  • “Great performance from the upload wizard. A lot of the more tiresome details are filled out automatically”
  • “Fantastic wizard makes process clearer, but please keep the old form for more experienced users. Thanks!”
  • “I never thought the old uploading process was too hard, but this new upload wizard is amazingly simple. It actually makes me want to upload more.”
  • “Much improved method of uploading files. Multiple file uploads, auto filling of dates, user name, etc, simplifies license input, all help to reduce time required to upload. Great work.”

The UploadWizard requires JavaScript (if JavaScript is disabled, you’ll get a simplified upload form instead). It’s been fully translated into Dutch, French, Galician, German, Hebrew, Indonesian, Interlingua, Macedonian, Malayalam, Portuguese, Russian, Slovenian, Tagalog, and Vietnamese (call for translations). Tell us what you think, and remember, if it doesn’t work for you, you can always go back to the old form. In the coming weeks, we’ll not only examine the impact that this new tool will have on the overall number of media uploads, but also whether it will lead to a larger percentage of deleted content (due to lower quality uploads). We will continue to improve the tool as we learn more.

Big thanks to the UploadWizard team — Neil Kandalgaonkar, Ryan Kaldari, Guillaume Paumier, Alolita Sharma — and to all code reviewers, operations engineers, translators and testers for their work on this project so far. We hope that you’ll enjoy the new upload experience. If you have images, sound files or videos with educational value that you’re willing to donate to the world, now is a good time to do it.

Erik Moeller
Deputy Director, Wikimedia Foundation

What’s in a name? In the case of ‘wiki’, lots of things.

Anyone who’s been watching the news will have heard about Wikileaks by now. Wikipedia shares the generic “wiki-” prefix in its name, but there’s no relation. Occasionally even major news sources like the BBC get this wrong, which can lead to serious confusion, even when it’s quickly fixed.

If anyone has a claim to the word “wiki”, it would be the Hawaiian people. In the Hawaiian language, wiki means “quick”. The words “wiki wiki” on a shuttle bus in Honolulu inspired software engineer Ward Cunningham to name a revolutionary piece of software – the “WikiWikiWeb” – in 1995. This software allowed people to instantly edit web pages, collaboratively.

Wikipedia was created six years later, based on the same principles. By that time, the word “wiki” was already used by a ton of different wiki software implementations. Today, you can go to the “WikiMatrix” website to compare them all. They have names like Wikidot, TWiki, or Wikispaces. Moreover, there are many, many content websites that use “wiki” in their names. Among them are Wikihow, Wikitravel, WikiAnswers, and Wikia.

Most of these projects are completely unrelated to Wikipedia. Wikipedia is operated by the non-profit Wikimedia Foundation, which was founded by Jimmy Wales in 2003. The Wikimedia Foundation operates a number of other free knowledge projects: Wikimedia Commons, Wiktionary, Wikibooks, Wikisource, Wikiquote, Wikispecies, Wikinews, and Wikiversity. It also organizes and supports development of the MediaWiki open source software.

The names of Wikimedia’s projects are trademarked. The word “wiki” isn’t: anyone can use it. Wikileaks and most other projects with “wiki” in their name have no relationship with us. If you see news organizations making this error, please email them or post a comment pointing to this blog post.

Encyclopedia of Life curates Wikipedia’s species articles

There are more than 1.9 million known species of animals, plants, and other forms of life on Earth. In May 2007, some of the world’s leading scientists announced the development of the Encyclopedia of Life (EOL) to document them all. Inspired by biologist E. O. Wilson’s TED Wish and supported by more than $25 million in funding, the project aggregates and makes accessible information about species from sources ranging from 19th century journals to modern online databases.

See the page about Solanum lycopersicum, the garden tomato, as an example. Much of the information comes from Solanaceae Source, a specialized source of  names lists, species descriptions, specimen collections and publication lists for the genus Solanum. The Biodiversity Heritage Library provides historical public domain texts about the species from various published journals. Many other specialized and general resources contribute to the overall species page.

A Wikipedia article included in an Encyclopedia of Life species page. The yellow background indicates that no curator has reviewed the content yet.

You’ll also find a “Wikipedia” entry in the table of contents. It reveals a copy of the Wikipedia article about tomatoes. As of this writing, the article text has a yellow background.

This means that an Encyclopedia of Life curator has not yet reviewed the content for inclusion in EOL. An EOL species page can have one or more curators who select and validate information added to EOL pages. Wikipedia articles, where they exist, are included by default.

Once the article has been validated by a curator, the yellow background is removed. The information for curators and curation standards pages on EOL give some additional background on the curation process, which applies to all content objects in EOL. Specific guidelines have been written for curation of content from Wikipedia and Wikimedia Commons. We’re particularly pleased that EOL encourages its curators to improve Wikipedia directly if errors or omissions are found.

So far, more than 200 Wikipedia articles have been reviewed through this process. Reviewers classify the information as follows:

  • ‘trusted’ – reviewed by curator and not deemed to contain substantially incorrect information
  • ‘untrusted’ – reviewed by curator and deemed to include incorrect or unverifiable information
  • ‘inappropriate’ – reviewed by curator and deemed to not be eligible for inclusion in EOL for other reasons (e.g. too short to add value)

EOL makes the entirety of all review information (who reviewed what when, with what outcome) available through an Atom feed. This means that Wikipedians, and others, can use this information easily in the development of new applications.
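As a rough illustration of how such a feed could be consumed, here is a minimal sketch in Python that reads Atom entries and extracts who reviewed what, when, and with what outcome. The sample feed, the use of the `category` element for the outcome, and the `parse_reviews` helper are all assumptions made for this example; the layout of the real EOL feed is not described in this post and may differ.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample data: the entry layout below is an assumption,
# not the actual structure of the EOL curation feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example curation feed</title>
  <entry>
    <title>Tomato</title>
    <updated>2010-10-01T12:00:00Z</updated>
    <author><name>Example Curator</name></author>
    <category term="trusted"/>
  </entry>
  <entry>
    <title>Example species</title>
    <updated>2010-10-02T08:30:00Z</updated>
    <author><name>Another Curator</name></author>
    <category term="untrusted"/>
  </entry>
</feed>"""


def parse_reviews(feed_xml):
    """Return one dict per Atom entry: article, reviewer, time, outcome."""
    root = ET.fromstring(feed_xml)
    reviews = []
    for entry in root.findall("atom:entry", ATOM_NS):
        category = entry.find("atom:category", ATOM_NS)
        reviews.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "reviewer": entry.findtext("atom:author/atom:name", default="", namespaces=ATOM_NS),
            "reviewed_at": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
            # Assumption: the review outcome is carried in a category term.
            "outcome": category.get("term") if category is not None else None,
        })
    return reviews


if __name__ == "__main__":
    for review in parse_reviews(SAMPLE_FEED):
        print(review)
```

Only the Python standard library is used, so a tool built this way would have no dependencies beyond the feed itself.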

The book creator tool makes it possible to order a printed and bound book from any Wikipedia article selection. A custom cover can be chosen. Nautilus photograph by Lee Berger, Creative Commons Attribution/Share-Alike License.

A proof-of-concept for expert reviews

Magnus Manske is a biochemist and programmer at the Sanger Institute in the United Kingdom. He is also a long-time Wikimedia volunteer, and wrote the first version of the PHP software used by Wikipedia, which later became MediaWiki. As a scientist, Magnus has advocated for the scientific community to use and improve Wikipedia, most recently as co-author of the paper Ten Simple Rules for Editing Wikipedia.

I informed Magnus about the new EOL review information, and suggested that we might want to explore using this information to generate printed books or PDF collections of reviewed articles. The software for exporting Wikipedia articles into books already exists, so it was just a matter of putting two and two together.

So, Magnus used the available data feed to create an automated tool that creates a list of all EOL-reviewed article versions in a form that can be used by Wikipedia’s book tool.

This makes it possible to download a PDF file or order a printed book that only contains EOL-reviewed versions of Wikipedia species articles.
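The core idea, turning review records into a list of specific article versions, can be sketched in a few lines of Python. This is not Magnus’ script: the record fields (`title`, `revision_id`, `outcome`), the choice to keep only the newest “trusted” revision per article, and the helper names are assumptions made for illustration. The only thing taken from MediaWiki itself is the standard `index.php?title=...&oldid=...` permalink form for a specific revision.

```python
from urllib.parse import urlencode


def permalink(title, revision_id):
    """Build a permanent link to one specific revision of an article."""
    query = urlencode({"title": title.replace(" ", "_"), "oldid": revision_id})
    return "https://en.wikipedia.org/w/index.php?" + query


def reviewed_versions(reviews):
    """Return (title, permalink) pairs for the newest trusted revision of each article.

    `reviews` is a list of dicts with hypothetical keys:
    'title', 'revision_id' (int) and 'outcome'.
    """
    newest = {}
    for review in reviews:
        if review["outcome"] != "trusted":
            continue  # assumption: only trusted versions go into the book
        current = newest.get(review["title"])
        if current is None or review["revision_id"] > current:
            newest[review["title"]] = review["revision_id"]
    return [(title, permalink(title, rev)) for title, rev in sorted(newest.items())]


if __name__ == "__main__":
    # Made-up review records for demonstration only.
    sample = [
        {"title": "Solanum lycopersicum", "revision_id": 100, "outcome": "trusted"},
        {"title": "Solanum lycopersicum", "revision_id": 250, "outcome": "trusted"},
        {"title": "Example species", "revision_id": 300, "outcome": "untrusted"},
    ]
    for title, url in reviewed_versions(sample):
        print(title, url)
```

Pinning each entry to a revision ID is what keeps the resulting book limited to the versions that were actually reviewed, even if the live article changes later.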

To try it out, visit the page for Magnus’ example book. Click “Download PDF” to generate the (very large) PDF file that contains all the species articles, or “order printed book” to preview or order a printed book from PediaPress (which, as of this month, also offers books in color and hardcover format). If you want to remix or play with the book further, you can click “Open book creator”.

We’re very pleased with this first proof-of-concept, and are grateful to the Encyclopedia of Life team for engaging its community in the curation of Wikipedia articles. Both parties benefit: The Encyclopedia of Life enriches its species pages using the often well-developed Wikipedia content. Wikipedia benefits because EOL’s trusted reviewers add their stamp of approval to Wikipedia articles, which helps Wikipedia readers and editors alike. Where EOL reviewers do not approve, they are encouraged to edit the Wikipedia article.

I asked Bob Corrigan, EOL Product Manager and Acting Deputy Director, to give his take on this project. He writes: “This is definitely a win-win partnership. EOL is focused on providing very deep, structured access to trusted biodiversity information from our network of content partners and curators, and vetted Wikipedia articles can be a terrific gateway to this information. We see a closer relationship with Wikimedia as an important way to expand access to global knowledge about life on Earth.”

Hardcover book made from curated Wikipedia articles. Photo credit: Guillaume Paumier; Nautilus photograph by Lee Berger. Creative Commons Attribution/Share-Alike License 3.0

Example page from the book. Photo credit: Guillaume Paumier; Nautilus photograph by Lee Berger. Creative Commons Attribution/Share-Alike License 3.0

A replicable model

Magnus’ implementation was already created with an eye to future extensibility. If you’re inclined to take a closer technical look, check out Magnus’ “Sifter-Books” script which generates the book data, and can potentially support multiple partner institutions/organizations providing article reviews. As of the time of this writing, Magnus has already added two additional groups who review Wikipedia articles, Rfam and Pfam, databases of RNA and protein families.

Moreover, Magnus has written a small proof-of-concept script which makes the existence of reviews visible on Wikipedia itself. You need to create a user account on the English Wikipedia and follow the installation instructions to use the script. Once installed, a “Reviews” tab will indicate available article reviews.

We look forward to exploring similar partnerships with subject-matter experts in institutions (like universities and libraries), scientific associations, and specialized knowledge communities. If you’re interested in this model, drop me a note (erik at wikimedia dot org).

Erik Moeller
Deputy Director, Wikimedia Foundation
Representative of Wikimedia in the Encyclopedia of Life Institutional Council

Musopen: Returning music to the public domain

One might think that a recording of Beethoven’s or Schumann’s music is in the public domain, free for anyone to share and enjoy, but that’s only the case if the recording artists decide to make their specific performance freely available. Most recordings of classical music are, in fact, copyrighted, and can’t be used without permission. Musopen is an independent charitable organization that’s recording music in the public domain, and making the recordings freely available as public domain works as well.

Now, Musopen is raising funds via the Kickstarter platform: Set Music Free. They’ve exceeded their $10,000 goal, but chipping in additional funds can only help. They are also looking for votes in the Pepsi Refresh Project, which could get them a $25,000 grant.

Wikimedia and other free culture projects benefit from these recordings: There are already more than 100 Musopen recordings of public domain music in Wikimedia Commons, used in more than 45 different Wikimedia projects. We wish Musopen success. Their work will help keep classical music alive.

See also: EFF coverage

Usability: Why Did We Move The Search Box?

On May 13th, we changed the default appearance of the English Wikipedia to use the new look developed as part of the Wikimedia Usability Initiative. On June 9th, we unveiled the new look in the remaining top 9 languages (by access volume). Other languages will follow in the coming weeks.

The key elements of the new design had been in public beta testing for many months, and hundreds of thousands of users had already adopted the new look. But, nothing compares to the real thing, and we tried to make the switch as painless as possible — by offering a quick way back to the old layout, by explaining our reasoning, observing and listening to comments carefully, fixing bugs and implementing changes quickly.

The single most frequently expressed concern about the changes we’ve made is the relocation of the search box from the left sidebar to the top right corner. This blog post will give an extended explanation of why we made the change, the other changes we made to the search, and what we’re planning to do next.

The old search box location

The default location of the search box in MediaWiki, the software used by Wikipedia, is below the “navigation” box in the top left corner. This was also the location in the English language Wikipedia, as well as many other language editions. Some language editions, including the German one, had customized the location of the search box, and moved it directly below the logo.

What do we know about search usability?

There are essentially three factors that influenced our decision to relocate the search box:

  • common user expectations regarding the placement of the search box on web pages, as determined by the preexisting body of usability research;
  • usability research regarding ideal search box width, and implications for the search box placement in our layout;
  • ability of our test subjects to locate and use the Wikipedia search box, as determined by Wikimedia usability tests in a research lab.

There are several scientific studies that have examined the ideal placement of common objects on web pages. One early study, conducted by Michael Bernard in 2001, surveyed participants about where they expected web objects such as internal links, external links, and search to be placed. It found that both new and experienced web users “generally expected internal search engines to be located in the upper and bottom-center of a web page. A smaller number expected it to be located at the top right of the page.”

This study was followed up five years later by A. Dawn Shaikh and Keisi Lenz (“Where’s the Search? Re-examining User Expectations of Web Objects”) in a survey of 142 participants. The study found that expectations had changed significantly, especially regarding the placement of the site search engine. The figure below illustrates the areas where participants expected the search to be found:

Expected location of site search engine

As the authors speculate and as seems intuitively plausible, early expectations of the placement of the search box were likely driven by the fact that search was commonly associated only with search engines of the time like AltaVista, not with site-specific searches. As more and more sites developed internal search functions, those were increasingly placed in slightly less exclusive screen real estate than the top center, shifting users’ expectations to look for search features in the top right corner.

Another factor that may have influenced user expectations is the common placement of search engine features in the top right corner of the web browser window.

There are practical advantages of positioning the search in the top right. As summarized in this research paper, several usability studies have pointed out a key advantage of navigational elements being placed on the right: it gives immediate access to the browser scrollbar. This is particularly valuable when a) scrolling up and down a list of search results, b) scrolling up and down an article you’ve just called up for information.

Search box width, and placement implications

A separate body of research examines the question of what width makes a search box user-friendly. A search box that is too narrow obscures the user’s query while typing, inhibiting their ability to complete their search quickly. Usability luminary Jakob Nielsen recommends an ideal width of 27 characters.

The old search box is approximately 20 characters wide; the new search box accommodates 24 characters. More importantly, due to the placement of the old search box in the sidebar of the layout, widening the search was impossible without either relocating it or widening the sidebar.

The search box placement in the top right allows us to maintain a fixed standard width from one page to the next, while giving us maximum flexibility as to what that width should be. To make it even easier for users, we are experimenting with an expandable search, which is currently deployed in our sandbox 3. When you click the box, it will expand significantly to the left.  We may or may not end up deploying this feature as we continue to look at ways to make search more accessible and user-friendly.

Our own research

In the course of the usability and user experience work since last year, we have so far completed a total of three usability studies, all of which are documented on the usability wiki:

These studies included both remote and San Francisco based participants. While the primary focus of our studies was obstacles people encountered when editing, finding search in the navigation was clearly one of them, and our test subjects tended to resort to common web search engines to navigate Wikipedia instead of using the site’s own search. With the new search box placement, users’ ability to find and use the site search was markedly improved. One user intuitively used the search box in its new location and then consciously realized that it had been moved. To see videos of the other subjects finding and using the search box with ease, please see here.

For those unfamiliar with usability testing, it’s important to note that small samples and agile, iterative tests are commonly understood to be an effective method for discovering most key user interface issues. Our sample sizes were actually larger than strictly necessary, and more diverse than typical due to our use of remote testing methods.

With that said, we didn’t test the English Wikipedia against other languages which had placed the search box directly below the logo, and we recognize that this alternative placement is already an improvement to match user expectations. However, based on the cited research above, as well as the design reasons for moving the search box to the top right, we still believe that the overall case for moving the search is compelling even for those languages, if slightly less so.

So .. why did you move the search box? I liked it where it was!

In sum, we moved the search box to a) match web practices and user expectations, b) make it possible to widen it, consistent with common usability recommendations, and c) address the problems we observed test subjects having with the old search.

We also recognize that millions of Wikipedia users had adjusted to the old placement, and will now have to re-adjust to the new placement. However, Wikipedia’s global audience grows by tens of millions of users every year (it is currently at 375 million unique visitors/month world-wide), and we hope to grow it by hundreds of millions in this decade. That will require that we adapt to common user expectations, rather than expecting every new user to adapt to us.

This will unfortunately inconvenience those who have adapted to the old placement. Do we know for certain that this is the correct decision? No, but the fact that existing users are temporarily inconvenienced does not by itself indicate that the decision is wrong.

Other search changes we made

It’s worth noting that the search box placement isn’t the only thing we changed about the search function. Perhaps most notably, the old search had two buttons (“Go” and “Search” in English). If you asked even an experienced user what the difference between those buttons was, you would get wildly different answers, and bug 577 had been open since 2004 because of this.

To answer the mystery: the “Go” button attempts to find an article with the same title as the entered search term and, if it fails, runs a full-text search of all articles.  “Search” will always run the full-text search.  “Search” is necessary where you want to search for a word instead of displaying the article of that title (say, you want to search for instances of “George W. Bush” all across Wikipedia).
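The difference between the two buttons can be expressed as a short sketch. This is illustrative Python, not MediaWiki’s actual search code: the function names, the in-memory `articles` dictionary, and the case-insensitive title comparison are all assumptions made for the example.

```python
def full_text_search(query, articles):
    """Return the titles of all articles whose title or text contains the query."""
    needle = query.lower()
    return sorted(
        title
        for title, text in articles.items()
        if needle in title.lower() or needle in text.lower()
    )


def go(query, articles):
    """'Go': jump to the article with a matching title, else fall back to full-text search."""
    for title in articles:
        # Assumption: titles are compared case-insensitively.
        if title.lower() == query.lower():
            return ("article", title)
    return ("results", full_text_search(query, articles))


def search(query, articles):
    """'Search': always run the full-text search, even if a title matches."""
    return ("results", full_text_search(query, articles))


if __name__ == "__main__":
    # Tiny made-up corpus for demonstration only.
    articles = {
        "George W. Bush": "George W. Bush was the 43rd president of the United States.",
        "Texas": "George W. Bush served as governor of Texas.",
    }
    print(go("George W. Bush", articles))      # goes straight to the article
    print(search("George W. Bush", articles))  # lists every article mentioning him
    print(go("governor", articles))            # no such title, so full-text results
```

Seen this way, “Go” is simply “Search” with a shortcut in front of it, which is why the new design can fold the less common full-text case into a drop-down option.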

In the new design, the less common case (search all across Wikipedia for a phrase, regardless of exact match) can be accessed using the “containing …” option in the drop-down menu. We believe this is both a more discoverable implementation, and it reduces overall clutter and complexity of the search.

Measures and coming changes

We are monitoring overall search volume. In the first week since the deployment, we have observed neither a statistically significant increase nor a decrease in search volume, but it’s too early to draw conclusions. There are also confounding variables. As noted above, the search box has changed not just in placement, but also in appearance and behavior. Finally, search volume isn’t the only interesting metric: search convenience (how long does it take users to, on average, find the search) is another one.

We’ll try to get our hands on solid metrics, but we’ll also continue to look for ways to make search more user-friendly (such as the auto-expansion), fix bugs, and so forth. In continuing our efforts to improve the user experience of all our projects, both for new and experienced users,  we’ll try to share our thoughts with you frequently, and work with you to figure out the right answer. And, if you just can’t get used to the new search — you can always switch back to the old layout, which will continue to be there for you.

Warmly,
the User Experience Team
Wikimedia Foundation

New Reports from November 2008 Survey Released

In November 2008, the Collaborative Creativity Group at UNU-Merit, in partnership with the Wikimedia Foundation, launched the most comprehensive survey of Wikipedia readers and contributors ever conducted. The survey was translated into 20 languages and received more than 170,000 responses. In April 2009, we shared preliminary results of the survey, and in August 2009, a member of the UNU-Merit team presented the survey at Wikimania (slides are available online).

Some key results from the survey have been widely reported, such as the finding that only about 13% of all Wikipedia contributors are female, a gender imbalance that poses a serious challenge to the Wikipedia project. The UNU-Merit group has now published four final reports:

The UNU-Merit team is planning to make a comprehensive final report available very soon, and to release the full, anonymized raw data later this year.

Erik Moeller
Deputy Director, Wikimedia Foundation

Using Video to Recruit New Wikipedia Editors

How can we recruit even more people to make Wikipedia a richer, deeper learning resource? For one thing, by making it easier to contribute (see our previous announcement). But, we also have to make our readers aware that their help is welcome, and ease them into taking the first steps to improving or creating an article. So, we’re funding the development of a slate of outreach resources such as brochures and videos that help people to get started, some of which target specific audiences like teachers and students.

Our partners are 27 regional Wikimedia chapter organizations, and anyone else who wants to help. Here are two recent examples.

Wikimedia Italia has funded the production of a 7 minute introductory video, “La Wikiguida di Wikipedia”. You can watch it on YouTube (with subtitles) below, or view or download the video in Ogg Theora format. It’s now linked to on every page of the Italian Wikipedia. The video was produced by Christian Biasco, and more videos are planned for later this year.

If you don’t speak Italian, you may be interested in Howcast’s lovely introduction to creating a Wikipedia article, embedded below:

Produced with guidance from Swedish Wikipedia volunteer Lennart Guldbrandsson, it’s a fun and comprehensive intro, and uses Howcast’s powerful “how-to player” to guide viewers through the instructions. Howcast San Francisco, by the way, now resides in the offices previously used by the Wikimedia Foundation, so perhaps they were inspired by forgotten wiki paraphernalia. ;-)

The Wikimedia Foundation didn’t plan or commission these videos, but we’re very happy and grateful that they were made – we believe instructional video resources will be essential as we scale our efforts to recruit new editors. A big thank you to Wikimedia Italia and Howcast for leading by example. Moving forward, we are seeking opportunities to assist and encourage our chapters and individual volunteers in creating these types of outreach resources.

Erik Moeller
Deputy Director, Wikimedia Foundation