Wikimedia blog

News from inside the Wikimedia Foundation

Deployments

MediaWiki 1.19 deployment to Wikimedia sites: Test it before it breaks

The logo of MediaWiki (a yellow sunflower surrounded by two pairs of blue square brackets) with gradients symbolizing its coming of age for the next version

Wikimedia sites will gradually be upgraded to version 1.19 of MediaWiki over the second half of February 2012.

This article is available in other languages on mediawiki.org.


Wikimedia engineers are putting the final touches to the latest version of MediaWiki, the software that powers Wikipedia and its sister sites. This version, labeled “1.19wmf1”, will be deployed to Wikimedia sites in stages, starting next week.

We’ve recently set up a Beta cluster, replicating a selection of Wikimedia wikis, where Wikimedians have tested the new version and checked that it worked reasonably well with their local wiki’s specific customizations.

Things are looking good, and the current plan is to run the deployment in five stages between February 15th and March 1st, 2012. The schedule may change based on unexpected issues, so you should refer to the MediaWiki 1.19 roadmap for an up-to-date schedule of when your wiki will be affected. (more…)

Scaling media storage at Wikimedia with Swift

Wikipedia is huge. Almost four million articles in English alone — but as they say, a picture is worth a thousand words (actually, it’s usually closer to several million). In terms of raw bits on disk, the largest project is clearly the Wikimedia Commons, the free media repository integrated with all of the Wikimedia projects. In addition, many projects allow their own local media uploads. As a result, across all wikis, Wikimedia stores millions of images, sounds, and other media files.

We’ve been able to manage the load for quite a while by using two servers with lots of local storage (10 and 30TB), but we’re pushing against that limit and we would like a more fault-tolerant option. So, for the last few months, we have been working on replacing the infrastructure that holds all that data.

Our goal is to have a storage system that will allow us to scale more easily, and accept large collections of media from projects like Wiki Loves Monuments, and the U.S. National Archives’ donation of their collection of photographs by Ansel Adams.

After evaluating a number of options, we chose to pursue OpenStack Swift. Swift is a distributed object storage system with automatic replication, so that if one host has problems, the requested file is retrieved from another server with no interruption of service. Aside from meeting our needs around performance, reliability, and scalability, it is a good fit considering we are also using OpenStack products for Wikimedia Labs.
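To make the replication idea concrete, here is a minimal sketch of a read that falls back to another replica when one host is down. It is not Swift's actual code or API: the StorageNode class, the NodeDown exception, and the object name are all hypothetical and only illustrate the behaviour described above.

```python
class NodeDown(Exception):
    """Raised when a storage node cannot serve a request."""


class StorageNode:
    """Hypothetical stand-in for one back-end storage host."""

    def __init__(self, name, objects, healthy=True):
        self.name = name
        self.objects = dict(objects)
        self.healthy = healthy

    def get(self, key):
        if not self.healthy:
            raise NodeDown(self.name)
        return self.objects[key]


def fetch_with_fallback(replicas, key):
    """Try each node holding a replica in turn; return the first successful read."""
    for node in replicas:
        try:
            return node.get(key), node.name
        except NodeDown:
            continue  # this host has problems, so try the next replica
    raise NodeDown("all replicas unavailable for " + key)


# Three replicas of the same (made-up) thumbnail; the first host is down.
KEY = "thumb/320px-Example.jpg"
nodes = [
    StorageNode("node1", {KEY: b"jpeg-bytes"}, healthy=False),
    StorageNode("node2", {KEY: b"jpeg-bytes"}),
    StorageNode("node3", {KEY: b"jpeg-bytes"}),
]
data, served_by = fetch_with_fallback(nodes, KEY)
print(served_by)
```

The caller never sees the failed host; it simply gets the bytes from the next node that holds a copy.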

We have just completed the first milestone along the road to replacing our existing storage systems with Swift: all image thumbnails (scaled images such as a 320px version of a picture) are now stored on Swift. Our current production Swift cluster is made up of 4 back-end storage nodes with 22TB each and 2 front-end proxy nodes that handle user web requests. This new architecture provides us the scalability and reliability we need going forward.
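For a rough sense of scale, the arithmetic behind those numbers can be sketched as below. The replica count is an assumption (it is not stated in this post), and the calculation ignores filesystem and other overhead, so treat the usable figure as a back-of-the-envelope estimate only.

```python
def cluster_capacity_tb(storage_nodes, tb_per_node, replicas):
    """Return (raw, usable) capacity in TB for a replicated object store."""
    raw = storage_nodes * tb_per_node
    usable = raw / replicas  # every object is stored `replicas` times
    return raw, usable


# 4 nodes with 22TB each are from the post; replicas=3 is an assumed value.
raw_tb, usable_tb = cluster_capacity_tb(storage_nodes=4, tb_per_node=22, replicas=3)
print(raw_tb, round(usable_tb, 1))
```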

Over the next few months we will build a second Swift cluster in our Virginia data center, then work on migrating all of the original media over to Swift as well. For more detail on the implementation and plan for Swift, you can read up on the documentation on Wikitech, ask questions in the comments below, or come and visit us in #wikimedia-tech on Freenode in IRC.

Ben Hartshorne
Operations
Wikimedia Foundation

Beta cluster allows Wikimedians to test upcoming software on Labs before deployment

Over the last few weeks, we’ve set up a test environment on Wikimedia Labs to replicate our production cluster and test new software before it’s deployed to Wikimedia sites. This will notably allow us to identify issues with the upcoming version of MediaWiki (1.19) before its deployment — but we need your help.

In case you haven’t heard yet, Wikimedia Labs is a platform that aims to make it easier for developers and system administrators to try out improvements to Wikimedia infrastructure, including MediaWiki, and to do analytics and bot work.

In the past, we’ve used prototype wikis to set up testing environments for upcoming releases of MediaWiki or to test new features. This has been helpful, but has suffered from lack of ongoing maintenance.

Over the holidays, I had the idea — with the upcoming 1.19 release, and the Labs servers newly online and available for non-WMF staff — of using Wikimedia Labs to duplicate the production cluster’s configuration in the Labs environment, and work with volunteers to help maintain this environment.

I particularly want to thank the following people for their work on this project:

  • Petr Bena has been driving this almost all the way. He started setting up this environment, the servers, and the Apache configuration, and has been helping to keep it going on a pretty consistent basis.
  • John du Hart came along after Petr had already begun and lent his experience with setting up wiki farms. With his help, we put together a really great configuration that more closely duplicates what is in production.
  • Oren Bochman has stepped in to get search working on our micro-cluster. On Wikimedia sites, search has always relied on the help of volunteers. While we don’t yet have search working, Oren has helped us document the search back-end — which will help others set up search like we have on the cluster — and has already started to help us build the next generation of search.

Join in now to identify issues before they reach your wiki

We’ve recently opened this up for the real testing, so now is the time to jump in. Please look at the cluster’s SiteMatrix and find wikis to test. Try reading, editing, using your favorite gadgets, and so on as you normally would; treat it as a giant sandbox. If you find a problem, please report it on the problem reports page.

With your help, we can make the upcoming upgrade smoother.

Mark A. Hershberger, Bugmeister

FeaturedFeeds brings syndication feeds of featured Wikimedia content

Example of the English Wikipedia featured articles feed generated by FeaturedFeeds

Yesterday, we deployed a new MediaWiki extension, FeaturedFeeds, to all Wikimedia wikis. It creates syndication feeds (Atom or RSS) of Wikipedia’s featured content, such as featured articles or pictures of the day, giving the projects a new way to deliver content to readers and users.
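For readers who want to consume such a feed programmatically, here is a minimal sketch that lists the entries of an Atom document using only the Python standard library. The sample feed below is made up for illustration; a real client would download the feed from the wiki first, and the actual feeds may carry more elements than shown here.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample document standing in for a downloaded feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example featured articles feed</title>
  <entry>
    <title>Example article one</title>
    <link rel="alternate" href="https://en.wikipedia.org/wiki/Example_one"/>
    <updated>2012-01-01T00:00:00Z</updated>
  </entry>
  <entry>
    <title>Example article two</title>
    <link rel="alternate" href="https://en.wikipedia.org/wiki/Example_two"/>
    <updated>2012-01-02T00:00:00Z</updated>
  </entry>
</feed>"""


def list_entries(feed_xml):
    """Return (title, link) pairs for every entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", namespaces=ATOM_NS)
        link = entry.find("atom:link", ATOM_NS).get("href")
        entries.append((title, link))
    return entries


entries = list_entries(SAMPLE_FEED)
for title, link in entries:
    print(title, link)
```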

For now, links to the feeds only appear in page metadata; in the future, we will add them to the sidebar on main pages, if communities so wish.

FeaturedFeeds integrates with the existing main page infrastructure: it uses data from templates to show content based on the current date.

Because user-generated content is involved, local wiki administrators need to make a few edits to MediaWiki pages to set up the extension. Instructions and a FAQ will guide you through the process. You can also use my edits to set up FeaturedFeeds on the English Wikipedia as an example.

If you have questions, you can ask for help on IRC, in #mediawiki and #wikimedia-mobile; we’ll be happy to help you set up the extension on your wiki.

Max Semenik
Mobile team developer

MediaWiki 1.18 deployment today to all Wikimedia sites

As reported two weeks ago, we’re planning to deploy MediaWiki 1.18 to all wikis, starting today (Tuesday, October 4, 23:00-03:00 UTC). We have been running MediaWiki 1.18 on several wikis already, representing about 2% of our total traffic. A big thank you to the early adopter wikis; it was very helpful getting some real world testing prior to deploying to the other 98%.

Our deployment process works like this. During our four-hour deployment window, we’ll be deploying to several wikis sequentially. Our tentative plan for today is to deploy first to fr.wikipedia.org, then pl.wikipedia.org, then en.wikipedia.org, and then probably a few more sequentially before deploying to the rest in bulk. At the end of our window, we will stop deploying, even if we’re not done (scheduling a follow-up window if needed).
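The process above can be sketched as a small simulation: deploy to one wiki at a time and stop as soon as the next deployment would run past the end of the window. This is only an illustration, not our deployment tooling; the per-wiki durations are invented and de.wikipedia.org is included purely as a hypothetical fourth wiki.

```python
from datetime import datetime, timedelta


def run_window(wikis, start, window, minutes_per_wiki):
    """Deploy sequentially; stop when the next wiki would not fit in the window."""
    deadline = start + window
    now = start
    done = []
    for wiki in wikis:
        finish = now + timedelta(minutes=minutes_per_wiki[wiki])
        if finish > deadline:
            break  # window is closing: leave the rest for a follow-up window
        done.append(wiki)
        now = finish
    remaining = wikis[len(done):]
    return done, remaining


# The order of the first three wikis is from the plan; durations are made up.
plan = ["fr.wikipedia.org", "pl.wikipedia.org", "en.wikipedia.org", "de.wikipedia.org"]
durations = {
    "fr.wikipedia.org": 60,
    "pl.wikipedia.org": 45,
    "en.wikipedia.org": 90,
    "de.wikipedia.org": 60,
}
done, remaining = run_window(
    plan,
    start=datetime(2011, 10, 4, 23, 0),
    window=timedelta(hours=4),
    minutes_per_wiki=durations,
)
print(done, remaining)
```

With these invented durations the first three wikis fit in the window and the fourth is carried over, which is the situation a follow-up window is meant to handle.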

To report issues in real time (especially during the deployment window), IRC is the best venue; please join us in #wikimedia-tech on Freenode (web access). For those of you who are comfortable with Bugzilla and other development tools, we would love your help with confirming issues and getting appropriate issues filed in our bug tracker. If you don’t feel comfortable using Bugzilla, you can leave a message on the talk page of our announcement on meta. Our developers can keep track of issues much better when you use Bugzilla, so filing your issue there makes it more likely your problem will be noticed and eventually addressed.

Thanks for your patience!

Rob Lanphier
Director of Platform Engineering

Update: October 5 05:14 UTC – we didn’t get as far as we wanted, but we deployed to fr.wikipedia.org, pl.wikipedia.org, en.wikipedia.org, and commons.wikimedia.org.  We’re planning to have one more deployment window in a little less than 18 hours (October 5, 23:00-October 6, 03:00 UTC) to deploy to the remaining wikis.

Protocol-relative URLs enabled on all Wikimedia Foundation wikis

In July we enabled protocol-relative URLs on testwiki and asked for bug reports. We did this in preparation for native HTTPS support for the sites. We received and fixed a number of bugs related to protocol-relative URLs, and then tested on a few of the larger wikis. We are now at a point where protocol-relative URL support is stable enough to enable on all wikis, so today we’ve enabled it.

For information about what protocol-relative URLs are, why they are needed, and how it’ll affect you, see the post written in July. In brief: this changes most links we output in our content from looking like http://www.example.com to //www.example.com . The change shouldn’t affect you.
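If you write client code, the short sketch below shows how a protocol-relative link is resolved: it inherits the scheme of the page it appears on. It uses urljoin from the Python standard library; the resolve helper and the example.com paths are just illustrative names.

```python
from urllib.parse import urljoin


def resolve(page_url, link):
    """Resolve a link found on page_url, as a browser or API client would."""
    return urljoin(page_url, link)


# The same protocol-relative link, seen from an HTTP page and an HTTPS page.
link = "//www.example.com/wiki/Help"
over_http = resolve("http://www.example.com/wiki/Main_Page", link)
over_https = resolve("https://www.example.com/wiki/Main_Page", link)
print(over_http)   # http://www.example.com/wiki/Help
print(over_https)  # https://www.example.com/wiki/Help
```

A client that assumes every link starts with http:// will stumble on such links, so resolving them against the URL they were fetched from is the safer approach.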

If you find any bugs related to protocol-relative URLs, please submit a bug report. Known issues are linked from the tracking bug.

Ryan Lane
Operations Engineer

EDIT Sep 28 14:29 UTC: Because of reported breakage in iOS clients, the API’s action=parse interface has been hacked not to return protocol-relative URLs. This is a temporary hack that you should not rely on; fix your clients instead. For details, see http://lists.wikimedia.org/pipermail/mediawiki-api-announce/2011-September/000024.html

Babel extension live on the WMF projects

Identifying language abilities has been very popular. The “Babel templates” are widely used on the English Wikipedia, and many of the literally hundreds of templates have been copied to other wikis.

Even with limited knowledge of a language, people can carry out many tasks effectively. It helps, however, when they are addressed in a way they can understand.

At translatewiki.net we have been using the Babel extension for a long time; it does not use any templates, and all the languages used in any of the WMF projects are supported. Because it has been in use for so long, it has become really rich in localisations.

Using the Babel extension is easy; my Babel user information for instance can be seen to the right and the syntax for this box can be seen below.

 

{{#babel:nl|en-4|de-2|fr-1}}
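
To show how the syntax above breaks down, here is a small sketch that splits a #babel call into language codes and proficiency levels. It is not the extension's own parser: the parse_babel function is hypothetical, and it assumes that a code without a level (like nl above) means a native speaker and that levels run from 0 to 5 plus N.

```python
LEVELS = {"0", "1", "2", "3", "4", "5", "N"}  # assumed set of valid levels


def parse_babel(wikitext):
    """Split a {{#babel:...}} call into (language code, level) pairs."""
    text = wikitext.strip()
    prefix, suffix = "{{#babel:", "}}"
    if not (text.startswith(prefix) and text.endswith(suffix)):
        raise ValueError("not a #babel call")
    result = []
    for part in text[len(prefix):-len(suffix)].split("|"):
        part = part.strip()
        if not part:
            continue
        # Language codes may themselves contain hyphens, so only treat the
        # last segment as a level when it is a recognised level value.
        code, sep, level = part.rpartition("-")
        if sep and level in LEVELS:
            result.append((code, level))
        else:
            result.append((part, "N"))  # assumption: bare code means native
    return result


abilities = parse_babel("{{#babel:nl|en-4|de-2|fr-1}}")
print(abilities)
```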

 

There have been many requests for the Babel extension, particularly from the newer and smaller projects. As people can choose to use this functionality and as it is particularly useful to people who are active on many projects, it has been implemented on all WMF wikis.

The documentation of the extension provides information about features like the inclusion of people in categories. These can be set by local admins.

Thanks,

Gerard Meijssen
Internationalization / Localization outreach consultant

MediaWiki 1.18 is coming

[Update 2011-09-24: The initial test deployment and stage 1 have gone well, with only minor glitches that we've mostly cleaned up.  Stage 2 and 3 are currently on schedule.  We've decided to add incubator.wikimedia.org to the list of wikis we'll be deploying to, which is reflected below.]

MediaWiki 1.18 will soon be deployed to all Wikimedia sites, including Wikipedia. As you may know, MediaWiki is the wiki software developed by the Wikimedia community, and 1.18 is the upcoming version of the software that has been in development since December.

Thanks to the completion of the heterogeneous deployment project, we are now able to run different versions of MediaWiki concurrently on Wikimedia sites. This means that we don’t have to upgrade all sites at the same time any more, which should limit the problems we encounter.
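Conceptually, running several versions side by side comes down to a per-wiki lookup of which version to serve. The sketch below illustrates that idea only; it is not the actual heterogeneous deployment configuration, and the wiki names, the previous version number, and the version_for helper are all hypothetical.

```python
# Hypothetical values for illustration only.
DEFAULT_VERSION = "1.17"  # assumed version still running on most wikis
VERSION_OVERRIDES = {
    "early-adopter-1.example.org": "1.18",
    "early-adopter-2.example.org": "1.18",
}


def version_for(wiki):
    """Return the MediaWiki version a given wiki should be served with."""
    return VERSION_OVERRIDES.get(wiki, DEFAULT_VERSION)


def upgrade(wikis, version):
    """Move a stage's worth of wikis to the new version."""
    for wiki in wikis:
        VERSION_OVERRIDES[wiki] = version


upgrade(["stage-2-wiki.example.org"], "1.18")
print(version_for("stage-2-wiki.example.org"), version_for("not-yet-upgraded.example.org"))
```

Each stage then only changes the mapping for the wikis in that stage, while everything else keeps running the old version.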

The deployment is scheduled to happen in several stages, starting next week:

Wikis in Stages 1 and 2 may experience more issues, so we plan to focus our attention on those wikis during these periods, and be particularly responsive. If you’d like to help make sure we catch problems before we roll out to your wiki, please help us test by trying out the test wiki starting Tuesday, and report the issues you find.

(more…)

Filter preventing abusive edits comes to all wikis

The AbuseFilter extension for MediaWiki, which helps prevent vandalism on wikis, will be globally enabled on all Wikimedia projects later today.

AbuseFilter was developed by Andrew Garrett with support from the Wikimedia Foundation; it was first enabled on the English Wikipedia in March 2009.

Since then, many local wiki communities have asked individually for AbuseFilter to be turned on on their wiki. As of July 2011, AbuseFilter was already enabled on 66 wikis, out of the 843 wikis the Wikimedia Foundation hosts.

It recently became apparent that it would be simpler to enable AbuseFilter by default on all wikis, rather than doing it on request.

When enabled, AbuseFilter comes with no built-in default filters, so no immediate change will be visible on wikis where it is enabled.

Unlike other anti-vandalism tools, AbuseFilter works by analyzing edits before they’re saved, rather than trying to identify (and revert) them after the fact.

Filters, or “rules”, can be added to AbuseFilter to identify certain kinds of edits matching a pattern. Actions can be taken for these edits, like tagging the edit, preventing the user from saving the page, or even automatically blocking the user. The AbuseFilter documentation provides the format in which filters must be written.
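As a rough illustration of the idea, the sketch below checks an edit against a list of rules before it is saved and collects the actions to take. It is written in Python and does not use AbuseFilter's own rule syntax (see the documentation for that); the Edit and Filter classes, the two example rules, and their patterns are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Edit:
    user: str
    user_edit_count: int
    added_text: str


@dataclass
class Filter:
    name: str
    matches: Callable[[Edit], bool]  # the pattern the rule looks for
    actions: Tuple[str, ...]         # any of "tag", "disallow", "block"


def check_edit(edit, filters):
    """Evaluate an edit before it is saved and collect the resulting actions."""
    outcome = {"save": True, "tags": [], "block_user": False}
    for f in filters:
        if not f.matches(edit):
            continue
        if "tag" in f.actions:
            outcome["tags"].append(f.name)
        if "disallow" in f.actions:
            outcome["save"] = False
        if "block" in f.actions:
            outcome["block_user"] = True
    return outcome


# Two made-up rules: one only tags, the other prevents saving.
filters = [
    Filter("new user adding capitals",
           lambda e: e.user_edit_count < 10
           and re.search(r"[A-Z!]{20,}", e.added_text) is not None,
           ("tag",)),
    Filter("long run of one character",
           lambda e: re.search(r"(.)\1{30,}", e.added_text) is not None,
           ("disallow",)),
]

tagged = check_edit(Edit("Example", 2, "ABCDEFGHIJKLMNOPQRSTUVWXYZ"), filters)
rejected = check_edit(Edit("Example", 2, "a" * 40), filters)
print(tagged, rejected)
```

The key point is the ordering: the check runs before the save, so a matching "disallow" rule stops the edit from ever reaching the page history.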

A screenshot of the list of AbuseFilter rules on the English Wikipedia

AbuseFilter catches abusive edits matching defined patterns.

Because AbuseFilter has been in use on the English Wikipedia for more than two years, more details about how AbuseFilter works are available in their documentation. Instructions on how to create a filter are also available.

It is possible to export filters from a wiki, and to import them into another one.

AbuseFilter is an extremely powerful tool, with the potential of preventing edits, blocking users, and making a whole wiki unusable. Therefore, it must be used with extreme caution; filters should only be created and edited by administrators who understand their purpose and syntax.

AbuseFilter can also be used to identify edits that are not abusive, for tracking purposes. Tags can be automatically added to edits matching a certain pattern, thus giving editors and patrollers a heads-up about certain edits (see examples).

Because such tags can also be used to identify legitimate edits, AbuseFilter is sometimes referred to as “Edit filter”.

AbuseFilter offers the possibility for certain filters to be private, to prevent long-time abusers from knowing how their edits are being identified.

We hope this tool will prove useful to our community of editors and patrollers.

Guillaume Paumier
Technical communications manager

Does your Wikipedia mobile App expect our full content layout?

If so, we have an upcoming change this week that you should be aware of. We’re in the final part of testing our new device detection, which will automatically redirect any mobile user agent we recognize to its corresponding .m mobile gateway. This means that if your app declares a mobile UA as recognized by WURFL and connects directly to us, we will redirect that traffic to .m.wikipedia.org and NOT .wikipedia.org.
Apps that go through an intermediate gateway that doesn’t send a mobile user agent will not be affected. If, on the other hand, your app does all of its logic itself, then you will need to explicitly identify your UA to us, or ensure that your UA contains “bot” to bypass redirection.
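
The decision described above can be sketched roughly as follows. This is not our production detection, which relies on WURFL; the MOBILE_HINTS list and the target_host function are hypothetical stand-ins that only illustrate the redirect rule and the “bot” bypass.

```python
# Hypothetical substrings standing in for real WURFL-based device detection.
MOBILE_HINTS = ("iphone", "android", "blackberry", "opera mini")


def target_host(user_agent, requested_host):
    """Return the host a request would end up on, given its user agent."""
    ua = user_agent.lower()
    if "bot" in ua:
        return requested_host  # UAs containing "bot" bypass redirection
    is_mobile = any(hint in ua for hint in MOBILE_HINTS)
    if is_mobile and ".m." not in requested_host:
        lang, rest = requested_host.split(".", 1)
        return f"{lang}.m.{rest}"  # e.g. en.wikipedia.org -> en.m.wikipedia.org
    return requested_host


print(target_host("Mozilla/5.0 (iPhone; CPU iPhone OS 5_0)", "en.wikipedia.org"))
print(target_host("ExampleApp-bot/1.0 (iPhone)", "en.wikipedia.org"))
```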

If this is not the behavior that you want, then please let us know on meta or come find us on Freenode in #wikimedia-mobile.

Tomasz Finc

Director of Mobile and Special Projects