Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Offline

Usability testing improves Kiwix user experience

During the recent Berlin hackathon in May, Wikimedia Developer Ryan Kaldari and Lead Kiwix Developer Emmanuel Engelhart led a usability study to better understand how to improve the user experience of the offline Wikipedia app Kiwix.

We were inspired by a presentation that Trevor Parscal did last year which showcased how easy it is to run a usability study.

With the help of Sumana Harihareswara and numerous others, we conducted seven interviews that highlighted some of the pain points our users were facing.

Some of the quick observations were:

  • Bookmarks are too complicated;
  • Tabs are not intuitive;
  • Some common command key combinations are not supported.

The test script and full results are available, and we’re now using what we learned to guide our next development sprints.

Some of the issues have already been resolved, as they were either in development or quick fixes, while others will require more research.

All the tests were recorded and the videos are already available on Wikimedia Commons.

We’d like to thank our testers who helped us immensely!

It was also great to see how easy it is to run such a study. We have many great opportunities to do research like this at meet-ups, hackathons, conferences, Wikimania, etc.

I’d love to see our community do more informal testing sessions; running just one in a geographic region would quickly surface issues our users are facing.

Are you interested? Don’t wait! Do your own and let us know how it went, or leave a comment below if you want more information.

Tomasz Finc
Director of Mobile and Special Projects

 

Update on Offline Wikipedia Projects

Last week was a big one for expanding offline Wikipedia work.

Right now, offline refers to supporting read access to Wikimedia content without an Internet connection.  This increases the reach of the Wikipedia movement by providing more opportunities for people all over the world to access the materials.  Some of the recent initiatives surrounding this project were documented in Wikimedia’s tech blog about a month ago (for more detail regarding the purpose for offline work, see the offline strategy page).

In support of our offline readership work, we’re thrilled to announce the launch of a new feature on Wikipedia developed with our partners from PediaPress.  Last week we enabled ZIM export (the main file format in which offline materials are stored) for the existing PediaPress collections extension on English Wikipedia and numerous other wikis.  This means that individuals can now use the existing PediaPress Create a book tool and download it in a format which can be read offline (via an offline reader, such as Kiwix).  This is important because it opens new avenues for the creation of offline materials, for example, an openZim library hosting different offline “book” options.

Also, the English offline collection Wikipedia 0.8 was made officially available, after much hard work by the Wikipedia 1.0 Editorial Team.  This collection is an iteration in the process of developing a vetted collection of offline articles selected based on their quality and topical importance.  The main constraint with an offline product is data size: the entirety of Wikipedia must somehow be condensed so that it fits on a CD, DVD, or USB stick.  Wikipedia 1.0 aims at creating the highest quality and most valuable subset of Wikipedia to meet those size requirements, and v0.8 is a precursor.  Wikipedia 0.8 is a general collection of just under 50K articles.  It is available for Mac, PC, or Linux with a Kiwix or Okawix reader; some mobile phone versions will be available later this month as well.
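To make the size constraint concrete, here is a minimal sketch of picking a subset of articles that fits a fixed storage budget. It is only an illustration and not the Wikipedia 1.0 Editorial Team's actual method: the `Article` fields, the 0-5 rating scales, the simple "quality plus importance" score and the sample sizes are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    quality: int      # hypothetical rating, 0 (poor) to 5 (excellent)
    importance: int   # hypothetical rating, 0 (low) to 5 (top)
    size_kb: int      # size of the packaged article in kilobytes


def select_articles(articles, budget_kb):
    """Greedily keep the highest-scoring articles that still fit the budget."""
    ranked = sorted(articles, key=lambda a: a.quality + a.importance, reverse=True)
    selected = []
    used_kb = 0
    for article in ranked:
        # Skip anything that would push the collection over the budget.
        if used_kb + article.size_kb <= budget_kb:
            selected.append(article)
            used_kb += article.size_kb
    return selected


if __name__ == "__main__":
    # Made-up sample data, not real assessments or sizes.
    candidates = [
        Article("Physics", quality=5, importance=5, size_kb=400),
        Article("Tomato", quality=4, importance=5, size_kb=300),
        Article("Obscure stub", quality=2, importance=1, size_kb=500),
        Article("Benin", quality=3, importance=3, size_kb=200),
    ]
    for article in select_articles(candidates, budget_kb=900):
        print(article.title, article.size_kb)
```

A greedy pass like this is easy to reason about but does not guarantee the best possible use of the available space; a real selection process also involves editorial judgement that a score cannot capture.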

More updates are sure to come on this offline front: Wikimedians around the world are actively assisting in the development of offline collections as well as distribution.  We are excited to support and document the momentum going forward.

Jessie Wild, Global Development

Update on Offline Wikimedia projects

Greetings,

With the annual fundraiser wrapping up, two sections of Wikimedia engineering are going to start moving more quickly: Mobile and Offline. The offline ecosystem has a lot of moving parts and it’s easy to get lost. The Wikimedia Foundation is currently focusing on three main areas of intervention: selection tools, file formats and offline apps.

Right now, “Offline” refers to supporting read access to Wikimedia content without an internet connection; increasing reach was identified during the Wikimedia strategic planning process as one of the movement priorities, and the first recommendation of the Offline task force was to “Simplify reuse of content from WMF projects”.

The first step in making Wikimedia content available offline is to select it. The Wikipedia Version 1.0 Editorial Team has been steadily releasing new versions of their beta Wikipedia collections, but technical limitations have hampered how quickly those can be finished. We’re going to evaluate the team’s tool set to see how to support them.

For example, we’re looking at extending the Wikipedia Release Version Tools to add features like sub-selection and comments (see an example of how the tool works for the Physics project).

Once the content has been selected, it needs to be packaged into a standard file format. The openZim format is an actively developed format for offline Wikipedia content, and we want to facilitate its integration into our general architecture.

Our first step is going to be the enhancement of the Collections extension to support openZim. This will be done by our partners from PediaPress, who have already started to work on it. They will need help from other community members to test the new openZim files created by the extension.

After selection and packaging, the last remaining piece is the application that allows readers to access the content. Over the past several years, there have been many Wikipedia offline apps: BzReader, MzReader, WikiTaxi, WikiFilter, Kiwix, Okawix, etc. Some have come and gone, while others continue to thrive and are actively releasing new updates.

One thing we’ve learned looking at this ecosystem is that there is a strong need for a full-featured, easy-to-use and well-supported offline app.

During the strategic planning process, one app emerged as a good candidate for the WMF to actively support: Kiwix. Kiwix has been around since 2007 and, through the great work of its lead developer Kelson, has steadily improved its feature set, platform support and overall stability.

In order to support this work and to help make the application even easier to use, we’ll be conducting a usability study on Kiwix, focused on search and browse, during the first quarter of 2011. Later this year, we’ll be focusing on an easier update cycle using openZim as the underlying storage format.

We hope 2011 will be full of exciting news about offline Wikimedia content. If you’d like to get involved, please participate in the strategic product discussion about Offline, or contact me if you’d like to help with development.

Tomasz Finc
Engineering Program Manager – Offline, Mobile, & Fundraising

Encyclopedia of Life curates Wikipedia’s species articles

There are more than 1.9 million animals, plants, and other forms of life on Earth. In May 2007, some of the world’s leading scientists announced the development of the Encyclopedia of Life (EOL) to document them all. Inspired by biologist E. O. Wilson’s TED Wish and supported by more than $25 million in funding, the project aggregates and makes accessible information about species from sources ranging from 19th-century journals to modern online databases.

See the page about Solanum lycopersicum, the garden tomato, as an example. Much of the information comes from Solanaceae Source, a specialized source of  names lists, species descriptions, specimen collections and publication lists for the genus Solanum. The Biodiversity Heritage Library provides historical public domain texts about the species from various published journals. Many other specialized and general resources contribute to the overall species page.

A Wikipedia article included in an Encyclopedia of Life species page. The yellow background indicates that no curator has reviewed the content yet.

You’ll also find a “Wikipedia” entry in the table of contents. It reveals a copy of the Wikipedia article about tomatoes. As of this writing, the article text has a yellow background.

This means that an Encyclopedia of Life curator has not yet reviewed the content for inclusion in EOL. An EOL species page can have one or more curators who select and validate information added to EOL pages. Wikipedia articles, where they exist, are included by default.

Once the article has been validated by a curator, the yellow background is removed. The information for curators and curation standards pages on EOL give some additional background on the curation process, which applies to all content objects in EOL. Specific guidelines have been written for curation of content from Wikipedia and Wikimedia Commons. We’re particularly pleased that EOL encourages its curators to improve Wikipedia directly if errors or omissions are found.

So far, more than 200 Wikipedia articles have been reviewed through this process. Reviewers classify the information as follows:

  • ‘trusted’ – reviewed by curator and not deemed to contain substantially incorrect information
  • ‘untrusted’ – reviewed by curator and deemed to include incorrect or unverifiable information
  • ‘inappropriate’ – reviewed by curator and deemed to not be eligible for inclusion in EOL for other reasons (e.g. too short to add value)

EOL makes the entirety of all review information (who reviewed what when, with what outcome) available through an Atom feed. This means that Wikipedians, and others, can use this information easily in the development of new applications.
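As a rough illustration of how such a feed could be consumed, the sketch below reads review records from an Atom document using Python's standard library. The feed layout is an assumption: the sample entries, the use of `<category term="...">` for the review outcome, and the curator names are hypothetical and are not taken from the real EOL feed, whose fields may differ.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample data: the entry layout below is an assumption,
# not the documented structure of the real EOL feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example curation activity</title>
  <entry>
    <title>Tomato</title>
    <updated>2010-10-01T12:00:00Z</updated>
    <author><name>Example Curator A</name></author>
    <category term="trusted"/>
  </entry>
  <entry>
    <title>Potato</title>
    <updated>2010-10-02T09:30:00Z</updated>
    <author><name>Example Curator B</name></author>
    <category term="untrusted"/>
  </entry>
</feed>"""


def parse_reviews(feed_xml):
    """Return one dict per Atom entry: who reviewed what, when, and the outcome."""
    root = ET.fromstring(feed_xml)
    reviews = []
    for entry in root.findall("atom:entry", ATOM_NS):
        category = entry.find("atom:category", ATOM_NS)
        reviews.append({
            "article": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "reviewer": entry.findtext("atom:author/atom:name", default="", namespaces=ATOM_NS),
            "reviewed_at": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
            # The outcome is assumed to live in the category's "term" attribute.
            "status": category.get("term") if category is not None else None,
        })
    return reviews


if __name__ == "__main__":
    for review in parse_reviews(SAMPLE_FEED):
        print(review["article"], review["status"], review["reviewer"], review["reviewed_at"])
```

Anyone building on the real feed would need to inspect it first and adjust the element names accordingly.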

The book creator tool makes it possible to order a printed and bound book from any Wikipedia article selection. A custom cover can be chosen. Nautilus photograph by Lee Berger, Creative Commons Attribution/Share-Alike License.

A proof-of-concept for expert reviews

Magnus Manske is a biochemist and programmer at the Sanger Institute in the United Kingdom. He is also a long-time Wikimedia volunteer, and wrote the first version of the PHP software used by Wikipedia, which later became MediaWiki. As a scientist, Magnus has advocated for the scientific community to use and improve Wikipedia, most recently as co-author of the paper Ten Simple Rules for Editing Wikipedia.

I informed Magnus about the new EOL review information, and suggested that we might want to explore using this information to generate printed books or PDF collections of reviewed articles. The software for exporting Wikipedia articles into books already exists, so it was just a matter of putting two and two together.

So, Magnus used the available data feed to create an automated tool that creates a list of all EOL-reviewed article versions in a form that can be used by Wikipedia’s book tool.
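The general idea can be sketched as follows. This is not Magnus' script: the review records are made up, the function name is hypothetical, and the output line syntax (a `{{saved_book}}` header followed by `fullurl` links pinned to a revision with `oldid`) is an assumption modelled on the saved-book wikitext format rather than a confirmed description of what his tool emits.

```python
# Hypothetical review records; article names and revision IDs are made up.
REVIEWS = [
    {"article": "Tomato", "oldid": 123456, "status": "trusted"},
    {"article": "Potato", "oldid": 234567, "status": "untrusted"},
    {"article": "Nautilus", "oldid": 345678, "status": "trusted"},
]


def build_book_wikitext(title, reviews):
    """Build a book page listing only the revisions a curator marked as trusted."""
    lines = ["{{saved_book}}", "", "== %s ==" % title, ""]
    for review in reviews:
        if review["status"] != "trusted":
            continue  # leave out untrusted or inappropriate revisions
        # Assumed syntax for pinning an article to one specific revision.
        lines.append(":[{{fullurl:%s|oldid=%d}} %s]"
                     % (review["article"], review["oldid"], review["article"]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_book_wikitext("EOL-reviewed species articles", REVIEWS))
```

Pinning each entry to a revision ID matters here because the curator's judgement applies to the version that was reviewed, not to whatever the article looks like after later edits.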

This makes it possible to download a PDF file or order a printed book that only contains EOL-reviewed versions of Wikipedia species articles.

To try it out, visit the page for Magnus’ example book. Click “Download PDF” to generate the (very large) PDF file that contains all the species articles, or “order printed book” to preview or order a printed book from PediaPress (which, as of this month, also offers books in color and hardcover format). If you want to remix or play with the book further, you can click “Open book creator”.

We’re very pleased with this first proof-of-concept, and are grateful to the Encyclopedia of Life team for engaging its community in the curation of Wikipedia articles. Both parties benefit: The Encyclopedia of Life enriches its species pages using the often well-developed Wikipedia content. Wikipedia benefits because EOL’s trusted reviewers add their stamp of approval to Wikipedia articles, which helps Wikipedia readers and editors alike. Where EOL reviewers do not approve, they are encouraged to edit the Wikipedia article.

I asked Bob Corrigan, EOL Product Manager and Acting Deputy Director, to give his take on this project. He writes: “This is definitely a win-win partnership. EOL is focused on providing very deep, structured access to trusted biodiversity information from our network of content partners and curators, and vetted Wikipedia articles can be a terrific gateway to this information. We see a closer relationship with Wikimedia as an important way to expand access to global knowledge about life on Earth.”

Hardcover book made from curated Wikipedia articles. Photo credit: Guillaume Paumier; Nautilus photograph by Lee Berger. Creative Commons Attribution/Share-Alike License 3.0

Example page from the book. Photo credit: Guillaume Paumier; Nautilus photograph by Lee Berger. Creative Commons Attribution/Share-Alike License 3.0

A replicable model

Magnus’ implementation was already created with an eye to future extensibility. If you’re inclined to take a closer technical look, check out Magnus’ “Sifter-Books” script which generates the book data, and can potentially support multiple partner institutions/organizations providing article reviews. As of the time of this writing, Magnus has already added two additional groups who review Wikipedia articles, Rfam and Pfam, databases of RNA and protein families.

Moreover, Magnus has written a small proof-of-concept script which makes the existence of reviews visible on Wikipedia itself. You need to create a user account on the English Wikipedia and follow the installation instructions to use the script. Once installed, a “Reviews” tab will indicate available article reviews.

We look forward to exploring similar partnerships with subject-matter experts in institutions (like universities and libraries), scientific associations, and specialized knowledge communities. If you’re interested in this model, drop me a note (erik at wikimedia dot org).

Erik Moeller
Deputy Director, Wikimedia Foundation
Representative of Wikimedia in the Encyclopedia of Life Institutional Council

Wikimedia Chapters Work Together to Bring More Free Knowledge to Africa

Next Sunday, 20 Israeli students will leave for humanitarian work in Africa, equipped with a portable offline Wikipedia thanks to a coordinated effort between the Wikimedia chapters in Israel, Switzerland and France.

Every year, the Africa Center at BGU, headed by Dr. Tamar Golan, sends a group of students on a three-month humanitarian expedition to developing countries in Africa. This year’s group is going to the Republic of Benin and the Republic of Cameroon.

To help, Wikimedia Israel decided to equip the students with computers running free software and containing an offline (static) version of the French Wikipedia, so that the students can bring free knowledge to Africans without access to the Internet. The students also have portable installations of the offline Wikipedia, so that they can install it on any other computers they come across in Africa.

We reached out to Hamakor, the Israeli Free and Open Source Software NGO, and Hamakor helped obtain computer donations, refurbished them and installed the Linux operating system on them.

Wikimedia Israel collaborated with members of Wikimedia Switzerland and Wikimedia France to produce an up-to-date static version of the French Wikipedia (numbering about 1 million entries, and including images), French being a major language of reading and writing in Cameroon and Benin.

Incidentally, the Linux version installed on those computers is called Ubuntu Linux, ‘Ubuntu’ being an African word (in the Zulu language) roughly translated as “unity of mankind” or “mutual reliance”.

We are very excited about this project that continues the Wikimedia Movement mission of supporting and promoting the distribution of free knowledge to everyone in the world.  We can’t wait to hear an update from the students next month.

Itzik Edri

Spokesman, Wikimedia Israel

More Ways to Share

Notice a new feature on the left-hand sidebar today? Now, you can “create a book” and take English Wikipedia with you wherever you go, thanks to the good work of our partner, PediaPress. First launched last year for German language Wikipedia, the feature has been extended to a number of languages, now including English. Initially, this feature was available only to logged-in users due to scalability issues, but today, everyone using English Wikipedia can assemble any articles of their choosing into a printed book, a PDF file, or an OpenDocument file for word processing.

To create your book, you can start by clicking on the “create a book” button found on the left-hand sidebar under the “print/export” section. From there, you can add any articles you like while browsing through millions of Wikipedia articles. When you’ve completed your selection, you can further customize your book by creating chapters and a title, choosing a photo for the cover and including an author or editor’s name.

Making Wikipedia available to as many people as possible and providing ways for our volunteer community to enjoy the work that they’ve done is central to our mission here at the Foundation. This is an exciting way to share more.

Moka Pantages

Communications

OpenMoko Launches WikiReader

OpenMoko (Om), a company that previously created an open source smartphone, has just launched The WikiReader, a dedicated reader device with an offline copy of the entire English Wikipedia (without images) stored on a small chip. With two AAA batteries, the WikiReader will run for several months, as it’s been optimized for low power consumption. The device has a simple LCD touchscreen and three buttons for searching, viewing random pages, and looking up previously viewed pages.

Building such a device is possible because, unlike most information on the web, Wikipedia content is freely licensed, allowing anyone to copy, modify, and re-use it for any purpose, including commercial uses. We’ve played with the device and given feedback during the development phase, but it’s not a Wikimedia Foundation product, and we make no guarantees of any kind for its operation.

The device showcases a great opportunity that free educational content creates: information from Wikipedia and similarly licensed projects can be packed into self-contained devices, including purpose-built ones like the WikiReader, without requiring any kind of Internet connectivity. In other words, it is very much possible to get a copy of the most comprehensive encyclopedia in human history to every person on the planet who would benefit from it.

While this device is targeted at least initially at users in the developed world, the software running on the WikiReader is open source, so that other projects can re-use it in whole or in part. (Information about that will go up on their website soon.) We welcome it as a creative new distribution method for Wikipedia content. Congratulations to Om for launching this product; we wish them the best of luck in the marketplace.

Erik Moeller
Deputy Director, Wikimedia Foundation

Wiki-to-print feature activated in six more Wikipedia languages

Yesterday we activated the wiki-to-print feature (see our recent blog post) in six additional Wikipedia language editions: French, Polish, Dutch, Portuguese, Spanish, and Simple English. In these language editions, it’s now possible to make collections of Wikipedia articles, share them, download them as PDF and OpenDocument files, or order them as printed books. We specifically activated it in the Simple English edition (which is a version of Wikipedia written in simple terms for children and adults learning English) so that English language users can get a first good feel for the functionality in a Wikipedia environment (it’s been active in English Wikibooks for a while). We’re hoping for a roll-out in additional languages including English very soon; our main concern is scalability of the feature under the massive load of the English Wikipedia.

The feature has been quickly embraced where it has been activated. In the German Wikipedia, since our deployment on January 27, more than 1,000 custom selections have been created and saved. Our technology partner, PediaPress, has been highly responsive to the rapidly accumulating feedback, and many small and larger output issues have been fixed in the last two weeks. For the new deployments, there’s a central feedback page on Meta.

It will be interesting to see how this feature affects writing on Wikipedia. When people start to think about their contributions in the context of a book, having a consistent structure and style is even more important than when viewing separate Wikipedia articles in a browser. Beyond increasing the quality and reach of our content, we also hope that this technology will be valued by our existing volunteer community as a way to turn their contributions into something that can be touched, held, given away — and by new writers as a motivation to participate.

Erik Moeller
Deputy Director, Wikimedia Foundation

(UPDATE 2/27: We’ve enabled it in the English Wikipedia for signed-in users and are observing server load and user feedback. If you’re logged in, see the help page for more information on how to use the tool. As always, the PediaPress team is amazingly responsive to issues that people encounter, and we expect continued improvements to the PDF and print quality over the coming weeks and months. If all goes well, we plan to deploy it on all relevant projects for all users in March. Language support in Chinese, Japanese, Korean, Arabic, Hebrew and some other languages still needs to improve and we won’t enable it in languages that the tool can’t handle appropriately yet – code contributions are welcome!)

Wiki-to-print feature now available in the German Wikipedia

A printed book ordered through PediaPress.com

A few weeks ago, we rolled out a feature to allow users to generate PDF files, OpenDocument word processor files, and on-demand printed books in one of our smaller sister projects, Wikibooks. This same technology has now also been experimentally enabled on the German Wikipedia (thanks to Frank Schulenburg for creating a beautiful help page). Essentially, you can compile a wiki-book from any number of Wikipedia articles, download a PDF or OpenDocument version, or order a printed version from our technology partner, PediaPress. And if you like your book remixes, you can save them for others to use and share.

If you want to take your favorite Wikipedia articles with you on the go, or if you want to have a nicely formatted PDF version, or you want to edit them further in a word processor, this technology is for you. The reason this is being tested on the German Wikipedia, in case you were wondering, is that PediaPress is a German company, and they will be able to respond quickly to feedback directly from the German Wikipedia community. With more than 1.4 billion pageviews a month, the German Wikipedia is also the second most viewed language edition, right after English with 5.2 billion pageviews. We’ve dedicated some hardware to this feature, and testing it on the German Wikipedia will give us a good idea of how it behaves under high traffic.

It should go without saying that all the code developed through this partnership is open source. In other words, if you want to set up your own wiki with PDF support, OpenDocument support, or connectivity to the PediaPress on-demand printing service, you can install the Collection Extension and enable it on your wiki. When we say free, we mean it.

If all goes well, this feature will become available in all Wikimedia projects where it makes sense. This technology has been developed with the generous support of the Commonwealth of Learning and the Open Society Institute.

Erik Moeller, Deputy Director, Wikimedia Foundation

PS: In unrelated tech news, our CTO Brion Vibber has blogged about the AbuseFilter extension, an important tool whose development we’re supporting, which will help Wikipedians to deal more effectively with spam, vandalism, and other destructive user behavior. And if you haven’t seen it, also note his recent post about the Drafts feature that’s being tested, and which should help against accidental loss of edits.