Wikimedia blog

News from inside the Wikimedia Foundation

Offline

Grand Prix Wikimedia Brazil: racing towards a better Wikipedia

(For the Portuguese version, please see the Wikimedia Brazil site.)

It was during Wikimania 2011, in a small restaurant in Haifa, that the news was announced: the largest popular computer manufacturer in Brazil, Grupo Positivo, was interested in installing an offline version of the Portuguese Wikipedia on its products. All of us from Wikimedia Brazil who were present were excited by the tremendous potential of such a distribution to spread the free encyclopedia and its mission around Brazil. In other words, this meant putting the Portuguese Wikipedia on approximately 13% of the national personal computer market, with greater penetration in the lower-income strata.

Despite the good news, a race against time began. It was necessary to prepare the offline version of the Portuguese Wikipedia, with 5000 articles of good quality, within a very short time: March 2012. The challenge was huge and to overcome it we needed to step on the gas.

The list of 5000 articles which were critical to include in the offline version was created in only three months, with the great assistance of Wikimedia Brazil volunteers. But the volunteers found that the quality of these articles was still not high enough: they were in desperate need of improvement before being taken offline. It was then that we had the idea of hosting our own “Grand Prix” – like the famous auto race. No cars and no laps, but with articles to be improved and many awards for the “pilots” who accept this challenge. Thus began the “I GP Wikimedia Brazil,” where each improved article is a completed lap.

The race will start in January 2012, and it is very easy to take part! Just sign up for one of the existing teams or start a new one. Registration will last until January 7. At the time of publishing this post, we have 51 participants divided into 15 teams, but the goal is to have at least 100 participants. After all, this is a Grand Prix where everyone wins!

Prizes will be distributed as teams improve the quality of the articles included in the list. There are buttons, stickers, notebooks and t-shirts with the brand of Wikipedia, as well as trophies and medals on the userpages of the participants. The rules of the award will be released soon after the formation of the teams, but we know that the biggest prize is the offline version of Wikipedia in Portuguese!

Imagine a world in which every single human being can freely share in the sum of all knowledge. That’s our commitment. Imagine, now, a Brazil where thousands of people – some of them even without access to the Internet – will share a small part of this knowledge. This is what we will do. Join a team and participate in this Grand Prix too!

(Written by the Wikimedia Brasil Community)

India Hackathon 2011

At the same time as Wikiconference India (#WCI11), there will be a genuine MediaWiki hackathon. The focus of this event will be to crush the technical obstacles that prevent Wikipedia and its sister projects from thriving in India.

This hackathon will be the first held in Asia. Many seasoned developers will be coming to Mumbai to learn first hand what is needed and to see what can be done there and then.

To make it a success, there is a wiki page with our current ideas for the hackathon. The premises will have rooms for break-out sessions, and there will be plenty of space, power, internet connectivity, coffee, tea and munchies.

Most important will be that we will be there to learn from you and to show you what we are working on. The Localisation team will be there, the off-line people will be there and the mobile team will be there. We need to meet the many people working on Open Source in India because we want to make sure that whatever we will do fits in with what is already there.

We hope to see you in Mumbai.

Thanks,

Gerard Meijssen
Internationalization / Localization outreach consultant

Kiwix localisation is supported at translatewiki.net

Offline use of Wikimedia content is a strategic goal for the Wikimedia Foundation. Kiwix is an offline app that allows users to read content without an internet connection, and it can now be localized into many languages on translatewiki.net.

There are many instances where people do not have an Internet connection available, or where it is cheaper to work offline, notably in the “Global south”.

Data from Wikimedia projects can be exported to the openZIM format, and then read offline on Kiwix, the only openZIM client.

Several projects with local developers invested a considerable amount of time creating their own offline app for their language, their script or for special requirements like formatting for books.

With the localization of Kiwix on translatewiki.net, it is now much more of an option to work on such features in Kiwix. Customizations like including fonts with a package or having specific formatting for a book or a source remain possible.

We hope our community will help localize Kiwix in the 270+ languages we currently support with Wikimedia projects. Please start translating the interface and let us know how it goes.

Thanks,

Gerard Meijssen
Internationalization / Localization outreach consultant

Come beta test offline Wikipedia

I’m happy to report that we have a new beta version of Kiwix available for testing. For those new to the project, Kiwix is the simplest and easiest way to take Wikipedia with you when you have no internet connection.

We’ve added some features that I’ll talk about below but for those of you that are just looking to get involved: download a fresh copy and give us feedback. Head over to our project pages if you want to see our full roadmap.

With this new beta we have some exciting new features:

  • Mac OS X version;
  • Content manager;
  • Revised search interface.

While the majority of our user base is on Linux and Windows, we didn’t want OS X users to feel left out. It’s now part of our regular build process. Three platform builds per release: that’s our goal.

We’re especially happy with how the content manager has turned out. Rather than having to scour the internet to find openZim files, you’ll now be able to discover new ones right within Kiwix.

We’re starting out with a limited set of data files to simplify our testing, but we’ll be expanding in the next months as we connect the download manager to the Books collection extension. This will greatly expand the amount of content you can download from Wikipedia. With the extra content, we’ll also add filtering capabilities to make sorting easier.

Finally, we’ve tweaked the look and feel of search results. It’s now far more similar to search engine results pages, which will hopefully make both search and browse much easier. There are also lots of other changes under the hood; for those curious, head over to the change log.

Tomasz Finc
Director Mobile & Special Projects

Brazil beginnings

At the end of June 2011, we had the opportunity to visit Brazil as part of the Wikimedia Foundation’s Brazil Catalyst Project – a project designed to develop open and collaborative approaches by which the Wikimedia Foundation can support the growth of the Wikimedia community in Brazil. Brazil is a priority country for the Wikimedia movement, both for contributions to Portuguese and other Wikimedia projects and for the opportunity to connect with millions of potential readers who are coming online.

Our visit involved a variety of meetings ranging from community gatherings to exploration of business partnerships to presentations at one of the biggest international conferences focused on free software (FISL); the whole agenda and supporting information can be seen on the Brazil Catalyst Project metawiki page. We spent most of the time listening and learning about Brazil and the Brazilian Wikimedia community. It was incredibly valuable to hear from a variety of people and we hope to continue the dialogue.

Sao Paulo: WikiSampa 8
Paulistas have been gathering periodically under the banner of WikiSampa for years, and we were fortunate to get a nice group together on a bank holiday. The group ran the gamut in depth of wiki experience: wiki newcomers sat alongside long time editors,  admins, and community members to share their experiences with Wikipedia and discuss the health and future of Wikipedia in Brazil. We spent about six hours together including a relaxed dinner at a local pizzeria.

As with every community gathering around the world, we were in awe of the positive spirit, dedication and friendliness. We were reminded again that, as Jimmy likes to say, Wikipedians are just “nice people.” We also heard about the struggles in the community. We heard the word “conflito” a lot, as the more experienced editors all shared a concern that the Portuguese Wikipedia community has an over-abundance of conflict between editors and that the community needs to find ways to refocus away from fighting. We are not yet clear on the causes of the conflict or whether PT:WP is worse than others, but there was a clear sense from those in attendance that they need to find new ways of working together so that new contributors will feel welcome and experienced contributors stay active and energized to continue building a great Wikipedia.

For more pictures see: Category:8_WikiSampa_June_2011, and for the official community page (in Portuguese) see: WikiSampa8.

Rio de Janeiro: first broad community meet-up!

Rio meet-up

Seven Cariocas began what we hope is a regular community gathering in Rio de Janeiro. This group brought fresh faces and minds eager to contribute to the sum of all knowledge. The excitement of the possible future of the RJ community specifically and Brazil at large was palpable: one professor in attendance is now planning to incorporate Wikipedia-editing into a university seminar course! She already has a blog just focused on this experience. A long time Wikipedian and self-proclaimed Wiki-addict met other Wikipedians for the first time and shared his experiences as an editor primarily on English Wikipedia.

Our conversation in RJ focused on the potential of Wikipedia as we had a number of newer community members. They were interested in exploring new ways to bring people into the community. One interesting theme was the prevalence of English. Unlike Sao Paulo, the conversation was in English.  We discussed the fact that a significant number of Brazilians apparently prefer to contribute to English Wikipedia to reach a global audience, even though there is plenty of room for growth of the Portuguese Wikipedia. Some also expressed that Portuguese Wikipedia is considered second class vs. English. We all agreed that having a first class Portuguese Wikipedia is vital to meeting our vision and we took away the question of how to encourage bilingual Brazilians to contribute in Portuguese.

Creating an offline Wikipedia
We had some promising conversations about the potential to distribute offline versions of Wikipedia to people who have computers, but do not have regular access to the Internet. This is a large proportion of Brazilians. We are committed to supporting partnerships to do this, but we need to create a selection of the Portuguese Wikipedia to make available offline. We would love it if community members who were interested in contributing to this initiative would connect with Jessie.

General remarks
These specific meet-ups in addition to other interactions with community in Brazil (in Recife, Campinas, and Porto Alegre) on this trip collectively communicated the great need and potential for mobilization behind the Portuguese Wikipedia within Brazil. While there are great obstacles – negative quality perceptions, low numbers of editors, limited admin support in addition to the fact that some editors prefer to edit the English Wikipedia – opportunities to mobilize existing community and engage a broader Brazilian population seem abundant, and there is no better time than now. We’re excited to continue supporting such a dynamic movement within Brazil and will continue to support and encourage outreach activities designed to further catalyze the collection and dissemination of knowledge within Brazil. We continue to seek more opportunities to hear from Brazilian community members and to learn more about opportunities. We’d also like to thank everyone who helped with the visit and who met with us. Muito Obrigado!

- Barry Newstead, Carolina Rossini, Jessie Wild

Usability testing improves Kiwix user experience

During the recent Berlin hackathon in May, Wikimedia Developer Ryan Kaldari and Lead Kiwix Developer Emmanuel Engelhart led a usability study to better understand how to improve the user experience of the offline Wikipedia app Kiwix.

We were inspired by a presentation that Trevor Parscal did last year which showcased how easy it is to run a usability study.

With the help of Sumana Harihareswara and numerous others, we conducted seven interviews that highlighted some of the pain points our users were facing.

Some of the quick observations were:

  • Bookmarks are too complicated;
  • Tabs are not intuitive;
  • Some common command key combinations are not supported.

The test script and full results are available, and we’re now using what we learned to guide our next development sprints.

Some of the issues have already been resolved, as they were either in development or quick fixes, while others will require more research.

All the tests were recorded and the videos are already available on Wikimedia Commons.

We’d like to thank our testers who helped us immensely!

It was also great to see how easy it is to run such a study. We have many great opportunities to do research like this at meet-ups, hackathons, conferences, Wikimania, etc.

I’d love to see our community do more informal testing sessions; running just one in a geographic region would quickly surface issues our users are facing.

Are you interested? Don’t wait! Do your own and let us know how it went, or leave a comment below if you want more information.

Tomasz Finc
Director of Mobile and Special Projects

Update on Offline Wikipedia Projects

Last week was a big week for expanding offline Wikipedia work.

Right now, offline refers to supporting read access to Wikimedia content without an Internet connection.  This increases the reach of the Wikipedia movement by providing more opportunities for people all over the world to access the materials.  Some of the recent initiatives surrounding this project were documented in Wikimedia’s tech blog about a month ago (for more detail regarding the purpose for offline work, see the offline strategy page).

In support of our offline readership work, we’re thrilled to announce the launch of a new feature on Wikipedia developed with our partners from PediaPress.  Last week we enabled ZIM export (the main file format in which offline materials are stored) for the existing PediaPress collections extension on English Wikipedia and numerous other wikis.  This means that individuals can now use the existing PediaPress Create a book tool and download it in a format which can be read offline (via an offline reader, such as Kiwix).  This is important because it opens new avenues for the creation of offline materials, for example, an openZim library hosting different offline “book” options.

Also, the English offline collection Wikipedia 0.8 was made officially available, after much hard work by the Wikipedia 1.0 Editorial Team. This collection is an iteration in the process of developing a vetted collection of offline articles selected based on their quality and topical importance. The main constraint on an offline product is data size: the entirety of Wikipedia must somehow be condensed so that it fits on a CD, DVD, or USB stick. Wikipedia 1.0 aims to create the highest-quality and most valuable subset of Wikipedia that meets those size requirements, and v0.8 is a precursor. Wikipedia 0.8 is a general collection of just under 50K articles. It is available for Mac, PC, or Linux with a Kiwix or Okawix reader; some mobile phone versions will be available later this month as well.
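To make the size constraint concrete, here is a minimal sketch of one way a selection could be fitted to a storage budget. It is not the Wikipedia 1.0 Editorial Team's actual method: the `Article` fields, the 0-5 rating scales, the additive score and the greedy strategy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    quality: int      # hypothetical rating, 0 (poor) to 5 (excellent)
    importance: int   # hypothetical rating, 0 (low) to 5 (top)
    size_kb: int      # size of the article on disk, in kilobytes


def select_articles(articles, budget_kb):
    """Pick the highest-scoring articles that fit within budget_kb.

    Articles are ranked by quality + importance (an assumed score) and
    added in that order; an article that does not fit is skipped so that
    smaller, lower-ranked articles can still fill the remaining space.
    """
    ranked = sorted(articles, key=lambda a: a.quality + a.importance, reverse=True)
    chosen, used_kb = [], 0
    for article in ranked:
        if used_kb + article.size_kb <= budget_kb:
            chosen.append(article)
            used_kb += article.size_kb
    return chosen, used_kb
```

A greedy pass like this is simple and predictable, but it does not guarantee the best possible use of the available space; a real selection would also have to account for images, indexes and reader software.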

More updates are sure to come on this offline front: Wikimedians around the world are actively assisting in the development of offline collections as well as distribution.  We are excited to support and document the momentum going forward.

Jessie Wild, Global Development

Update on Offline Wikimedia projects

Greetings,

With the annual fundraiser wrapping up, two sections of Wikimedia engineering are going to start moving more quickly: Mobile and Offline. The offline ecosystem has a lot of moving parts and it’s easy to get lost. The Wikimedia Foundation is currently focusing on three main areas of intervention: selection tools, file formats and offline apps.

Right now, “Offline” refers to supporting read access to Wikimedia content without an internet connection; increasing reach was identified during the Wikimedia strategic planning process as one of the movement priorities, and the first recommendation of the Offline task force was to “Simplify reuse of content from WMF projects”.

The first step in making Wikimedia content available offline is to select it. The Wikipedia Version 1.0 Editorial Team has been steadily releasing new versions of their beta Wikipedia collections, but technical limitations have hampered how quickly those can be finished. We’re going to evaluate the team’s tool set to see how to support them.

For example, we’re looking at extending the Wikipedia Release Version Tools to add features like sub-selection and comments (see an example of how the tool works for the Physics project).

Once the content has been selected, it needs to be packaged into a standard file format. The openZim format is an actively developed format for offline Wikipedia content, and we want to facilitate its integration into our general architecture.

Our first step is going to be the enhancement of the Collections extension to support openZim. This will be done by our partners from PediaPress, who have already started to work on it. They will need help from other community members to help test the new openZim files created by the extension.

After selection and packaging, the last remaining piece is the application that allows readers to access the content. Over the years, there have been lots of Wikipedia offline apps: BzReader, MzReader, WikiTaxi, WikiFilter, Kiwix, Okawix, etc. Some have come and gone, while others continue to thrive and are actively releasing new updates.

One thing we’ve learned looking at this ecosystem is that there is a strong need for a full-featured, easy-to-use and well-supported offline app.

During the strategic planning process, one app emerged as a good candidate for the WMF to actively support: Kiwix. Kiwix has been around since 2007 and, through the great work of its lead developer Kelson, has steadily improved its feature set, platform support and overall stability.

In order to support this work and to help make the application even easier to use, we’ll be conducting a usability study on Kiwix, focused on search and browse, during the first quarter of 2011. Later this year, we’ll be focusing on an easier update cycle using openZim as the underlying storage format.

We hope 2011 will be full of exciting news about offline Wikimedia content. If you’d like to get involved, please participate in the strategic product discussion about Offline, or contact me if you’d like to help with development.

Tomasz Finc
Engineering Program Manager – Offline, Mobile, & Fundraising

Encyclopedia of Life curates Wikipedia’s species articles

There are more than 1.9 million known species of animals, plants, and other forms of life on Earth. In May 2007, some of the world’s leading scientists announced the development of the Encyclopedia of Life (EOL) to document them all. Inspired by biologist E. O. Wilson’s TED Wish and supported by more than $25 million in funding, the project aggregates and makes accessible information about species from sources ranging from 19th-century journals to modern online databases.

See the page about Solanum lycopersicum, the garden tomato, as an example. Much of the information comes from Solanaceae Source, a specialized source of name lists, species descriptions, specimen collections and publication lists for the genus Solanum. The Biodiversity Heritage Library provides historical public domain texts about the species from various published journals. Many other specialized and general resources contribute to the overall species page.

A Wikipedia article included in an Encyclopedia of Life species page. The yellow background indicates that no curator has reviewed the content yet.

You’ll also find a “Wikipedia” entry in the table of contents. It reveals a copy of the Wikipedia article about tomatoes. As of this writing, the article text has a yellow background.

This means that an Encyclopedia of Life curator has not yet reviewed the content for inclusion in EOL. An EOL species page can have one or more curators who select and validate information added to EOL pages. Wikipedia articles, where they exist, are included by default.

Once the article has been validated by a curator, the yellow background is removed. The information for curators and curation standards pages on EOL give some additional background on the curation process, which applies to all content objects in EOL. Specific guidelines have been written for curation of content from Wikipedia and Wikimedia Commons. We’re particularly pleased that EOL encourages its curators to improve Wikipedia directly if errors or omissions are found.

So far, more than 200 Wikipedia articles have been reviewed through this process. Reviewers classify the information as follows:

  • ‘trusted’ – reviewed by curator and not deemed to contain substantially incorrect information
  • ‘untrusted’ – reviewed by curator and deemed to include incorrect or unverifiable information
  • ‘inappropriate’ – reviewed by curator and deemed to not be eligible for inclusion in EOL for other reasons (e.g. too short to add value)

EOL makes the entirety of all review information (who reviewed what when, with what outcome) available through an Atom feed. This means that Wikipedians, and others, can use this information easily in the development of new applications.
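As a rough illustration of how such a feed could be consumed, the sketch below parses Atom entries with Python's standard library. The sample feed is invented for this example: the real EOL feed's URL and element layout are not described here, so carrying the review outcome in a `<category term="...">` element is purely an assumption, as are the curator names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard Atom namespace (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Invented sample data; the structure of the real feed may differ.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Hypothetical curation feed</title>
  <entry>
    <title>Solanum lycopersicum</title>
    <updated>2010-10-01T12:00:00Z</updated>
    <author><name>Example Curator A</name></author>
    <category term="trusted"/>
  </entry>
  <entry>
    <title>Nautilus pompilius</title>
    <updated>2010-10-02T09:30:00Z</updated>
    <author><name>Example Curator B</name></author>
    <category term="untrusted"/>
  </entry>
  <entry>
    <title>Ailuropoda melanoleuca</title>
    <updated>2010-10-03T15:45:00Z</updated>
    <author><name>Example Curator A</name></author>
    <category term="trusted"/>
  </entry>
</feed>"""


def parse_reviews(feed_xml):
    """Return one dict per Atom entry: who reviewed what, when, and the outcome."""
    root = ET.fromstring(feed_xml)
    reviews = []
    for entry in root.findall("atom:entry", ATOM_NS):
        category = entry.find("atom:category", ATOM_NS)
        reviews.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "reviewer": entry.findtext("atom:author/atom:name", default="", namespaces=ATOM_NS),
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
            # Assumed convention: the outcome is stored in the category term.
            "status": category.get("term") if category is not None else None,
        })
    return reviews


def count_by_status(reviews):
    """Tally reviews per outcome ('trusted', 'untrusted', 'inappropriate')."""
    return Counter(review["status"] for review in reviews)
```

Calling `count_by_status(parse_reviews(SAMPLE_FEED))` tallies the invented entries by outcome; pointing the same functions at a real feed would require adapting the element names to whatever that feed actually publishes.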

The book creator tool makes it possible to order a printed and bound book from any Wikipedia article selection. A custom cover can be chosen. Nautilus photograph by Lee Berger, Creative Commons Attribution/Share-Alike License.

A proof-of-concept for expert reviews

Magnus Manske is a biochemist and programmer at the Sanger Institute in the United Kingdom. He is also a long-time Wikimedia volunteer, and wrote the first version of the PHP software used by Wikipedia, which later became MediaWiki. As a scientist, Magnus has advocated for the scientific community to use and improve Wikipedia, most recently as co-author of the paper Ten Simple Rules for Editing Wikipedia.

I informed Magnus about the new EOL review information, and suggested that we might want to explore using this information to generate printed books or PDF collections of reviewed articles. The software for exporting Wikipedia articles into books already exists, so it was just a matter of putting two and two together.

So, Magnus used the available data feed to create an automated tool that creates a list of all EOL-reviewed article versions in a form that can be used by Wikipedia’s book tool.

This makes it possible to download a PDF file or order a printed book that only contains EOL-reviewed versions of Wikipedia species articles.
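The core idea, pinning each article to the exact version a curator reviewed, can be sketched in a few lines. This is not Magnus’ actual tool: the record layout (`title`, `rev_id`, `status`) and the choice to keep only the newest 'trusted' revision per article are assumptions for illustration. The only Wikipedia-specific detail relied on is the standard `index.php?title=...&oldid=...` permanent-link form.

```python
from urllib.parse import urlencode

# Base URL of the English Wikipedia's index.php entry point.
BASE_URL = "https://en.wikipedia.org/w/index.php"


def permalink(title, rev_id, base=BASE_URL):
    """Build a permanent link to one specific revision of an article."""
    query = urlencode({"title": title.replace(" ", "_"), "oldid": rev_id})
    return f"{base}?{query}"


def reviewed_versions(reviews, accepted=("trusted",)):
    """Return (title, permalink) pairs for accepted reviews, sorted by title.

    `reviews` is an iterable of dicts with hypothetical keys 'title',
    'rev_id' and 'status'. If an article was reviewed more than once,
    only the highest accepted revision id is kept.
    """
    latest = {}
    for review in reviews:
        if review["status"] not in accepted:
            continue
        title, rev_id = review["title"], review["rev_id"]
        if title not in latest or rev_id > latest[title]:
            latest[title] = rev_id
    return [(title, permalink(title, rev_id)) for title, rev_id in sorted(latest.items())]
```

A list of this kind would still have to be converted into whatever input the book tool expects; that step is left out because its format is not described here.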

To try it out, visit the page for Magnus’ example book. Click “Download PDF” to generate the (very large) PDF file that contains all the species articles, or “order printed book” to preview or order a printed book from PediaPress (which, as of this month, also offers books in color and hardcover format). If you want to remix or play with the book further, you can click “Open book creator”.

We’re very pleased with this first proof-of-concept, and are grateful to the Encyclopedia of Life team for engaging its community in the curation of Wikipedia articles. Both parties benefit: the Encyclopedia of Life enriches its species pages using the often well-developed Wikipedia content. Wikipedia benefits because EOL’s trusted reviewers add their stamp of approval to Wikipedia articles, which helps Wikipedia readers and editors alike. Where EOL reviewers do not approve, they are encouraged to edit the Wikipedia article.

I asked Bob Corrigan, EOL Product Manager and Acting Deputy Director, to give his take on this project. He writes: “This is definitely a win-win partnership. EOL is focused on providing very deep, structured access to trusted biodiversity information from our network of content partners and curators, and vetted Wikipedia articles can be a terrific gateway to this information. We see a closer relationship with Wikimedia as an important way to expand access to global knowledge about life on Earth.”

Hardcover book made from curated Wikipedia articles. Photo credit: Guillaume Paumier; Nautilus photograph by Lee Berger. Creative Commons Attribution/Share-Alike License 3.0

Example page from the book. Photo credit: Guillaume Paumier; Nautilus photograph by Lee Berger. Creative Commons Attribution/Share-Alike License 3.0

A replicable model

Magnus’ implementation was already created with an eye to future extensibility. If you’re inclined to take a closer technical look, check out Magnus’ “Sifter-Books” script which generates the book data, and can potentially support multiple partner institutions/organizations providing article reviews. As of the time of this writing, Magnus has already added two additional groups who review Wikipedia articles, Rfam and Pfam, databases of RNA and protein families.

Moreover, Magnus has written a small proof-of-concept script which makes the existence of reviews visible on Wikipedia itself. You need to create a user account on the English Wikipedia and follow the installation instructions to use the script. Once installed, a “Reviews” tab will indicate available article reviews.

We look forward to exploring similar partnerships with subject-matter experts in institutions (like universities and libraries), scientific associations, and specialized knowledge communities. If you’re interested in this model, drop me a note (erik at wikimedia dot org).

Erik Moeller
Deputy Director, Wikimedia Foundation
Representative of Wikimedia in the Encyclopedia of Life Institutional Council

Wikimedia Chapters Work Together to Bring More Free Knowledge to Africa

Next Sunday, 20 Israeli students will leave for humanitarian work in Africa, equipped with a portable offline Wikipedia thanks to a coordinated effort between the Wikimedia chapters in Israel, Switzerland and France.

Every year, the Africa Center at BGU, headed by Dr. Tamar Golan, sends a group of students on a three-month humanitarian expedition to developing countries in Africa. This year’s group is going to the Republic of Benin and the Republic of Cameroon.

To help, Wikimedia Israel decided to equip the students with computers running free software and containing an offline (static) version of the French Wikipedia, so that the students can bring free knowledge to Africans without access to the Internet. The students also have portable installations of the offline Wikipedia, so that they can install it on any other computers they come across in Africa.

We reached out to Hamakor, the Israeli Free and Open Source Software NGO, and Hamakor helped obtain computer donations, refurbished them and installed the Linux operating system on them.

Wikimedia Israel collaborated with members of Wikimedia Switzerland and Wikimedia France to produce an up-to-date static version of the French Wikipedia (numbering about 1 million entries, and including images), French being a major language of reading and writing in Cameroon and Benin.

Incidentally, the Linux version installed on those computers is called Ubuntu Linux, ‘Ubuntu’ being an African word (in the Zulu language) roughly translated as “unity of mankind” or “mutual reliance”.

We are very excited about this project that continues the Wikimedia Movement mission of supporting and promoting the distribution of free knowledge to everyone in the world.  We can’t wait to hear an update from the students next month.

Itzik Edri

Spokesman, Wikimedia Israel