Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts by Erik Moeller

Wikimedia Foundation is looking for a Vice President of Engineering

Developing and maintaining the code and infrastructure that enable the global Wikimedia volunteer community to contribute to Wikipedia and our other projects is at the heart of the Wikimedia Foundation’s work. For the past 2.5 years, I’ve led our combined Engineering and Product department. We’ve done lots of hiring and grown the department from roughly 35 to 100+ people during that time. We have many projects underway which we hope will dramatically improve the experience of our contributors and our readers, including VisualEditor, Flow (a new discussion system), and new reader and contribution features for mobile users.

About a year ago, I announced that we needed to start thinking about dividing the responsibilities of the VP of Engineering and VP of Product into two separate leadership roles (closely collaborating on a day-to-day basis), at which point I’d focus on the VP Product part of my current role. The Director-level roles referenced in that announcement now exist—we’ve since hired a Director of User Experience and a Director of Analytics. Now it’s time for us to search for a VP of Engineering to complete the change. We’re partnering in this search with Julie Locke from Vantage Partners, an executive search firm.

We’re looking for someone who shares some fundamental beliefs with us:

  • that working in partnership with a global community of open source developers, and in close dialog with our users, is the best way to achieve lasting and positive changes to our technology;
  • that teams do their best work when they’re inspired and empowered to do good work, not because they’re “managed” to do it;
  • that it’s the job of management to create the conditions for teams and individuals to succeed, by equipping them with resources, mentoring and supporting them in their adoption of effective processes for self-organization;
  • that highly iterative development (“release early, release often”) delivers the most value to our readers and our community;
  • that hiring for diversity—of geography, gender, culture, skills, etc.—leads to more successful and effective teams.

Ideally, you’ve put these beliefs into practice in the real world, in a context where you’ve delivered open source technology to users with short delivery/deployment cycles, where you’ve supported operation of a high traffic site reaching millions of users, and where you’ve held leadership responsibilities in service of multiple, diverse, interdependent teams. You’re passionate about open source, and above all, you’re excited by the Wikimedia vision: a world in which every single human being can freely share in the sum of all knowledge.

Wikimedia has great technical challenges ahead: continually modernizing our user experience, decoupling monolithic aspects of our architecture, and supporting the greatest innovations our community comes up with. We’re looking for a collaborative, brilliant and effective leader who can help us tackle these challenges. If that describes you, take a look at the full job description, and apply today.

Erik Moeller
Vice President of Engineering and Product Development, Wikimedia Foundation

Remembering Aaron Swartz (1986-2013)

Aaron Swartz at a Boston Wikipedia meetup in 2009

Aaron Swartz was found dead in his New York apartment Friday, an apparent suicide. Aaron was a prolific hacker and a free culture activist. He was also a Wikipedian. Today, the Internet community at large is reeling from Aaron’s early death, and Wikimedia is joining in remembering an extraordinary individual.

In 2000, as a 13-year-old, he was the youngest finalist in a teen website competition with his project “The Info Network”, an online encyclopedia inviting anyone to contribute their knowledge. Aaron would later recall that while he was not able to find enough contributors for his first web site, “luckily, several years later, my mother pointed me to this new site called ‘Wikipedia’ that was doing the same thing.”

At age 14, Aaron co-authored RSS 1.0, an important web standard. Later he founded Infogami, a startup which would merge with Reddit, which today is one of the most influential social news sites. He led the development of the Open Library, a project launched by the non-profit Internet Archive in 2007 with the aim of offering “one web page for every book”, integrating user contributions through a wiki interface.

In 2003 he started editing Wikipedia. His userpage lists more than 200 articles he started or contributed a large amount of content to. His most recent edit was on Thursday, January 10.

In 2006, he was a candidate for the Wikimedia Foundation Board of Trustees, which in part is elected by the Wikimedia community. It was during that time that he wrote a series of essays about Wikipedia, sharing his concerns, hopes and dreams for the project’s future.

These included “Who writes Wikipedia”, which proposed that the role of casual contributors to the encyclopedia is often severely underestimated, and that protecting the encyclopedia’s fundamentally open nature was critical to its future. “If Wikipedia continues down this path of focusing on the encyclopedia at the expense of the wiki, it might end up not being much of either,” Aaron wrote. His essay triggered a debate and research that continues to this day.

In recent years, Aaron’s focus was on online activism. He believed strongly that the freedoms that we take for granted online are constantly under threat and need to be defended. To this end, he co-founded Demand Progress, and was one of the leaders in the grass-roots campaign against legislation known as SOPA and PIPA, a campaign which Wikipedia participated in through the 2012 Wikipedia blackout. Aaron’s keynote at the Freedom to Connect conference in 2012 re-tells the important story of how SOPA and PIPA were ultimately defeated.

Aaron also strongly believed that the public should have free access to the laws that govern it, and to publicly funded scholarship and scientific research. In 2011, he was indicted for allegedly breaking into MIT’s network to download large amounts of scholarly materials.

Family, friends and those close to the case have raised questions about the fervor and zeal with which Aaron was pursued — Lawrence Lessig’s post “Prosecutor as bully” provides some important background, as does expert witness Alex Stamos’ summary.

Whatever caused Aaron to take his own life, it is a shocking and painful loss of an extraordinary individual who has touched so many through his ideas and actions. His friends and family have started an online memorial to share remembrance stories, and Wikipedians are also leaving comments on his talk page. We join them in remembering Aaron Swartz, a beautiful human being.

1 million media files uploaded using Upload Wizard

In May 2011, we announced a new way to share pictures, sounds, and video: the Upload Wizard. A year later, Upload Wizard has been used to upload more than 1 million freely licensed media files and has contributed to an acceleration of growth of the Wikimedia Commons community.

Countering the decline in retention of new contributors to Wikipedia, the number of contributors to Wikimedia Commons (individuals who make at least one upload) grew by about 25% from March 2011 to March 2012, compared with ~12% in the prior year. We attribute this growth primarily to two factors: the introduction of the Upload Wizard, and the successful “Wiki Loves Monuments” competition in September 2011, highlighted on the graph below.

Wikimedia Commons uploader statistics, 2011-2012

Download the text of the entire English Wikipedia

If you’d like to read Wikipedia in an airplane (of the offline variety) or in an area with no or limited connectivity, or install it in a university, or just to have it handy in case of a zombie apocalypse, you can now download a full text copy of the English Wikipedia (from January 2012) in the convenient OpenZIM format, which was specifically developed for sharing wiki content.

OpenZIM files can be read in multiple reader applications, the most popular of which is Kiwix, available for Mac, Windows, Linux, and Sugar.

Start your BitTorrent client and grab a copy of the 9.7GB file (.torrent link, other download options). You can also download content packages directly from within Kiwix using its library feature, including content from sister projects like Wiktionary and Wikisource, as well as non-Wikimedia content.

While the ZIM file doesn’t include images (that would blow it up to ~100GB for thumbnail-sized images), it does come with all the lists, tables, citations, and even mathematical formulas included in the online version.

Wikimedia content has always been made available under free and open licensing terms in raw copies, but ZIM content packages offer a higher level of convenience for the end user.

Please note that this OpenZIM file was prepared by Emmanuel Engelhart, the developer of Kiwix, and feedback should be directed to him (contact at kiwix dot org) or submitted through the Kiwix feedback system.

October 2011 Coding Challenge winners

In October 2011, we tried a new experiment in attracting volunteer developers and advertising opportunities to get involved with Wikimedia’s open source codebase. The October 2011 Coding Challenge invited developers to submit projects in three categories:

  • Mobile Wikipedia: Uploading images and other media via your smartphone
  • Slideshows: Showcase Wikipedia’s beautiful multimedia
  • Realtime: Surface changes to Wikipedia’s content more dynamically

While we received lots of interesting submissions in all three categories, ultimately we had to pick three winners. The grand prize winners in each category get to attend a 2012 Wikimedia event of their choice at our expense. Two runners-up in each category are receiving a certificate of excellence acknowledging the high quality of their submission.

Announcing the October 2011 Coding Challenge

Great programmers drive any successful tech organization, and great programmers can be hard to find. Fortunately, the Wikimedia Foundation has a unique advantage: millions of unique visitors, every single day. It’s Wikipedia’s global impact which has enabled us to mobilize hundreds of thousands of donors every year to support our mission in our annual fundraising campaign.

We wanted to find out if we could find some of the world’s best programmers using similar means, to cultivate potential future volunteers and job candidates.  Thus, the idea of the Coding Challenge was born.

This is, admittedly, an experiment, and we’re not entirely sure how it will go. We’ve structured it like a contest, and contests can be tricky, because they have rules. One of the most important rules: only one contestant per challenge can win The Grand Prize (an all-expenses-paid trip to an eligible Wikimedia event). We run the risk of creating an ultra-competitive environment, in which people care more about winning than about helping to build a better Wikipedia.

Let’s reiterate, then: the goal of the contest is to find the best programmers — and at WMF, we get to decide exactly what that means.  (See the part in the rules about “sole discretion”.)  And a great programmer must write great code, of course — but the greatest programmers know that it means a lot more than that, too.  Is the documentation good?  Have ideas been exchanged on the wikitech-l mailing list?  Were participants generous with their time and ideas?  Do we see those ideas proliferating through the code of others?

There’s also a lot of value in this contest even to those who don’t “win”. Maybe there’s only one grand prize per challenge, but WMF can, and will, bestow accolades on everyone who writes good code and shares it with everyone else. And those things matter: when a potential employer comes to call, and you can point to the nice folks at Wikipedia to vouch for your work, that’s no small thing.

There are great potential programmers all over the globe.  If we can convince even a small fraction of you to accept one of these challenges, we will consider this little experiment a resounding success.

The Coding Challenge will run until November 7, 2011, 23:59 UTC.

Greg DeKoenigsberg, Coding Challenge Coordinator
Erik Moeller, VP of Engineering and Product Development

Expanded Use of Article Feedback Tool

Today on English Wikipedia we rolled out the Article Feedback Tool – previously featured on 3,000 English Wikipedia articles – to a larger set of 100,000 articles. This initial expansion is intended to further assess both the tool’s value and its performance characteristics, with an eye to a full deployment on Wikipedia and potentially other projects.

Some examples of articles that currently feature the tool (at bottom):

The intent of the tool is two-fold:

  • to gain aggregate quality assessments of Wikimedia content by readers and editors;
  • to use as an entry vector for other forms of engagement.

To assess its value in both categories, we’ve already undertaken a significant amount of qualitative and quantitative research. You can read an extensive summary of our work so far here.

The high level summary based on the data we’ve seen so far: We believe user ratings can be a valuable way to predict high and low quality content in Wikimedia, and we’re especially interested in engaging raters beyond the initial act of assessing an article. Through our trials to date, we’ve seen very good conversion rates on the calls-to-action that follow a rating, suggesting that this could be a powerful engagement tool as well.

Beyond continuing our own research and these engagement experiments, our goal is to regularly make available anonymized data from the tool, and to supply editors with a dashboard tool for surfacing trends in the rating data. We’re looking forward to sharing wider findings from the use of the tool soon.

Please use the talk page or comment below for feedback, questions and suggestions.

Erik Moeller, Deputy Director

A new way to share pictures, sounds and video

UploadWizard uploading multiple files

On April 15, Wikimedia Commons celebrated its 10 millionth media file. A new feature will help to increase that number even faster. The upload wizard, which entered public beta in late November and has been used to upload more than 10,000 files already, is now the default upload tool on Wikimedia Commons. Use it and tell us what you think, as we continue to improve it.

Here’s what’s different:

  • Instead of overloading the user interface with information about licensing and acceptable content, there’s a single comic strip tutorial explaining the licensing policy, which can be dismissed after the first time you see it.
  • You only see complexity when you need to see it. There are sensible defaults for licensing, automatic metadata extraction from the uploaded files, etc.
  • You can upload up to 10 files as a batch, instead of having to upload each file individually. You can see thumbnails of the files you’re uploading, and abort any individual upload.
  • Error cases should be handled in a clear and understandable fashion, and guide the user towards the most sensible action (e.g. when a file needs to be renamed, the upload shouldn’t fail: instead, the tool will prompt that a rename is necessary).
  • As a final step, the UploadWizard explains how to add uploaded files to pages in Wikimedia projects.

And here’s what some of our experienced users have said during the beta:

  • “The upload wizard provides a much less cluttered and confusing upload process.”
  • “Great performance from the upload wizard. A lot of the more tiresome details are filled out automatically”
  • “Fantastic wizard makes process clearer, but please keep the old form for more experienced users. Thanks!”
  • “I never thought the old uploading process was too hard, but this new upload wizard is amazingly simple. It actually makes me want to upload more.”
  • “Much improved method of uploading files. Multiple file uploads, auto filling of dates, user name, etc, simplifies license input, all help to reduce time required to upload. Great work.”

The UploadWizard requires JavaScript (if JavaScript is disabled, you’ll get a simplified upload form instead). It’s been fully translated into Dutch, French, Galician, German, Hebrew, Indonesian, Interlingua, Macedonian, Malayalam, Portuguese, Russian, Slovenian, Tagalog, and Vietnamese (call for translations). Tell us what you think, and remember, if it doesn’t work for you, you can always go back to the old form. In the coming weeks, we’ll examine not only the impact that this new tool has on the overall number of media uploads, but also whether it leads to a larger percentage of deleted content (due to lower quality uploads). We will continue to improve the tool as we learn more.

Big thanks to the UploadWizard team — Neil Kandalgaonkar, Ryan Kaldari, Guillaume Paumier, Alolita Sharma — and to all code reviewers, operations engineers, translators and testers for their work on this project so far. We hope that you’ll enjoy the new upload experience. If you have images, sound files or videos with educational value that you’re willing to donate to the world, now is a good time to do it.

Erik Moeller, Deputy Director

What’s in a name? In the case of ‘wiki’, lots of things.

Anyone who’s been watching the news will have heard about Wikileaks by now. Wikipedia shares the generic “wiki-” prefix in its name, but there’s no relation. Occasionally even major news sources like the BBC get this wrong, which can lead to serious confusion, even when it’s quickly fixed.

If anyone has a claim to the word “wiki”, it would be the Hawaiian people. In the Hawaiian language, wiki means “quick”. The words “wiki wiki” on a shuttle bus in Honolulu inspired software engineer Ward Cunningham to name a revolutionary piece of software – the “WikiWikiWeb” – in 1995. This software allowed people to instantly edit web pages, collaboratively.

Wikipedia was created six years later, based on the same principles. By that time, the word “wiki” was already used by a ton of different wiki software implementations. Today, you can go to the “WikiMatrix” website to compare them all. They have names like Wikidot, TWiki, or Wikispaces. Moreover, there are many, many content websites that use “wiki” in their names. Among them are Wikihow, Wikitravel, WikiAnswers, and Wikia.

Most of these projects are completely unrelated to Wikipedia. Wikipedia is operated by the non-profit Wikimedia Foundation, which was founded by Jimmy Wales in 2003. The Wikimedia Foundation operates a number of other free knowledge projects: Wikimedia Commons, Wiktionary, Wikibooks, Wikisource, Wikiquote, Wikispecies, Wikinews, and Wikiversity. It also organizes and supports development of the MediaWiki open source software.

The names of Wikimedia’s projects are trademarked. The word “wiki” isn’t: anyone can use it. Wikileaks and most other projects with “wiki” in their name have no relationship with us. If you see news organizations making this error, please email them or post a comment pointing to this blog post.

Encyclopedia of Life curates Wikipedia’s species articles

There are more than 1.9 million known species of animals, plants, and other forms of life on Earth. In May 2007, some of the world’s leading scientists announced the development of the Encyclopedia of Life (EOL) to document them all. Inspired by biologist E. O. Wilson’s TED Wish and supported by more than $25 million in funding, the project aggregates and makes accessible information about species from sources ranging from 19th-century journals to modern online databases.

See the page about Solanum lycopersicum, the garden tomato, as an example. Much of the information comes from Solanaceae Source, a specialized source of names lists, species descriptions, specimen collections and publication lists for the genus Solanum. The Biodiversity Heritage Library provides historical public domain texts about the species from various published journals. Many other specialized and general resources contribute to the overall species page.

A Wikipedia article included in an Encyclopedia of Life species page. The yellow background indicates that no curator has reviewed the content yet. Click the image to enlarge.

You’ll also find a “Wikipedia” entry in the table of contents. It reveals a copy of the Wikipedia article about tomatoes. As of this writing, the article text has a yellow background.

This means that an Encyclopedia of Life curator has not yet reviewed the content for inclusion in EOL. An EOL species page can have one or more curators who select and validate information added to EOL pages. Wikipedia articles, where they exist, are included by default.

Once the article has been validated by a curator, the yellow background is removed. The information for curators and curation standards pages on EOL give some additional background on the curation process, which applies to all content objects in EOL. Specific guidelines have been written for curation of content from Wikipedia and Wikimedia Commons. We’re particularly pleased that EOL encourages its curators to improve Wikipedia directly if errors or omissions are found.

So far, more than 200 Wikipedia articles have been reviewed through this process. Reviewers classify the information as follows:

  • ‘trusted’ – reviewed by curator and not deemed to contain substantially incorrect information
  • ‘untrusted’ – reviewed by curator and deemed to include incorrect or unverifiable information
  • ‘inappropriate’ – reviewed by curator and deemed to not be eligible for inclusion in EOL for other reasons (e.g. too short to add value)

EOL makes the entirety of all review information (who reviewed what when, with what outcome) available through an Atom feed. This means that Wikipedians, and others, can use this information easily in the development of new applications.
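To make this concrete, here is a minimal sketch of how such a feed could be consumed. It only illustrates the idea and does not describe EOL's actual feed: the sample XML, the use of a category element to carry the review outcome, the article titles, revision IDs and curator names are all assumptions made up for this example. Only the Atom namespace itself is standard.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace, in ElementTree's "{uri}tag" notation.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed. The real EOL feed may be laid out differently;
# titles, revision IDs, curators and dates here are invented for illustration.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Hypothetical curation feed</title>
  <entry>
    <title>Tomato</title>
    <link href="https://en.wikipedia.org/w/index.php?title=Tomato&amp;oldid=111"/>
    <category term="trusted"/>
    <author><name>Curator A</name></author>
    <updated>2010-10-01T12:00:00Z</updated>
  </entry>
  <entry>
    <title>Nautilus</title>
    <link href="https://en.wikipedia.org/w/index.php?title=Nautilus&amp;oldid=222"/>
    <category term="untrusted"/>
    <author><name>Curator B</name></author>
    <updated>2010-10-02T12:00:00Z</updated>
  </entry>
  <entry>
    <title>Example stub</title>
    <link href="https://en.wikipedia.org/w/index.php?title=Example_stub&amp;oldid=333"/>
    <category term="inappropriate"/>
    <author><name>Curator A</name></author>
    <updated>2010-10-03T12:00:00Z</updated>
  </entry>
</feed>"""


def reviewed_entries(feed_xml, status="trusted"):
    """Return the feed entries whose (assumed) category term equals status."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.iter(ATOM + "entry"):
        category = entry.find(ATOM + "category")
        if category is None or category.get("term") != status:
            continue
        link = entry.find(ATOM + "link")
        results.append({
            "title": entry.findtext(ATOM + "title"),
            "url": link.get("href") if link is not None else None,
            "curator": entry.findtext(ATOM + "author/" + ATOM + "name"),
            "updated": entry.findtext(ATOM + "updated"),
        })
    return results


if __name__ == "__main__":
    # List the article versions a curator has marked as trusted.
    for item in reviewed_entries(SAMPLE_FEED):
        print(item["title"], "->", item["url"])
```

Filtering on the review outcome is the step that matters for reuse: an application that only wants vetted content keeps the “trusted” entries and the exact article versions they point to, and ignores the rest.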

The book creator tool makes it possible to order a printed and bound book from any Wikipedia article selection. A custom cover can be chosen. Nautilus photograph by Lee Berger, Creative Commons Attribution/Share-Alike License. (Click to enlarge.)

A proof-of-concept for expert reviews

Magnus Manske is a biochemist and programmer at the Sanger Institute in the United Kingdom. He is also a long-time Wikimedia volunteer, and wrote the first version of the PHP software used by Wikipedia, which later became MediaWiki. As a scientist, Magnus has advocated for the scientific community to use and improve Wikipedia, most recently as co-author of the paper Ten Simple Rules for Editing Wikipedia.

I informed Magnus about the new EOL review information, and suggested that we might want to explore using this information to generate printed books or PDF collections of reviewed articles. The software for exporting Wikipedia articles into books already exists, so it was just a matter of putting two and two together.

So, Magnus used the available data feed to create an automated tool that creates a list of all EOL-reviewed article versions in a form that can be used by Wikipedia’s book tool.

This makes it possible to download a PDF file or order a printed book that only contains EOL-reviewed versions of Wikipedia species articles.

To try it out, visit the page for Magnus’ example book. Click “Download PDF” to generate the (very large) PDF file that contains all the species articles, or “order printed book” to preview or order a printed book from PediaPress (which, as of this month, also offers books in color and hardcover format). If you want to remix or play with the book further, you can click “Open book creator”.

We’re very pleased with this first proof-of-concept, and are grateful to the Encyclopedia of Life team for engaging its community in the curation of Wikipedia articles. Both parties benefit: The Encyclopedia of Life enriches its species pages using the often well-developed Wikipedia content. Wikipedia benefits because EOL’s trusted reviewers add their stamp of approval to Wikipedia articles, which helps Wikipedia readers and editors alike. Where EOL reviewers do not approve, they are encouraged to edit the Wikipedia article.

I asked Bob Corrigan, EOL Product Manager and Acting Deputy Director, to give his take on this project. He writes: “This is definitely a win-win partnership. EOL is focused on providing very deep, structured access to trusted biodiversity information from our network of content partners and curators, and vetted Wikipedia articles can be a terrific gateway to this information. We see a closer relationship with Wikimedia as an important way to expand access to global knowledge about life on Earth.”

Hardcover book made from curated Wikipedia articles. Photo credit: Guillaume Paumier; Nautilus photograph by Lee Berger. Creative Commons Attribution/Share-Alike License 3.0

Example page from the book. Photo credit: Guillaume Paumier; Nautilus photograph by Lee Berger. Creative Commons Attribution/Share-Alike License 3.0

A replicable model

Magnus’ implementation was already created with an eye to future extensibility. If you’re inclined to take a closer technical look, check out Magnus’ “Sifter-Books” script which generates the book data, and can potentially support multiple partner institutions/organizations providing article reviews. As of the time of this writing, Magnus has already added two additional groups who review Wikipedia articles, Rfam and Pfam, databases of RNA and protein families.

Moreover, Magnus has written a small proof-of-concept script which makes the existence of reviews visible on Wikipedia itself. You need to create a user account on the English Wikipedia and follow the installation instructions to use the script. Once installed, a “Reviews” tab will indicate available article reviews.

We look forward to exploring similar partnerships with subject-matter experts in institutions (like universities and libraries), scientific associations, and specialized knowledge communities. If you’re interested in this model, drop me a note (erik at wikimedia dot org).

Erik Moeller
Deputy Director, Wikimedia Foundation
Representative of Wikimedia in the Encyclopedia of Life Institutional Council