Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts by Guillaume Paumier

Wikimedia engineering April 2013 report

Wikimedia engineering January 2013 report

Major news in January includes:

Note: We’re also proposing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.

Wikimedia sites to move to primary data center in Ashburn, Virginia

(Update on January 22nd, 2013, 20:00 (UTC): Our Operations team considers the migration to be over. Major disruption is no longer expected.)


All Wikimedia sites, including Wikipedia, may encounter temporary interruptions on January 22–24, as they transition to servers in a new data center in Ashburn, Virginia.

Next week, the Wikimedia Foundation will transition its main technical operations to a new data center in Ashburn, Virginia, USA. This is intended to improve the technical performance and reliability of all Wikimedia sites, including Wikipedia.

Engineering teams have been preparing for the migration to minimize inconvenience to our users, but major service disruption is still expected during the transition. Our sites will be in read-only mode for some time, and may be intermittently inaccessible. Users are advised to be patient during those interruptions, and share information in case of continued outage or loss of functionality.

The current target windows for the migration are January 22nd, 23rd and 24th, 2013, from 17:00 to 01:00 UTC (see other timezones on timeanddate.com).
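For readers who want to check these windows against their own clock, the conversion can be sketched in a few lines of Python. This is only an illustration: the three locations and their fixed UTC offsets are assumptions picked for the example (they hold in January, when daylight saving time is not in effect), and the function names are made up for this sketch.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# Assumed example zones, as fixed offsets that are valid in January
# (no daylight saving time in effect).
ZONES = {
    "San Francisco (PST)": timezone(timedelta(hours=-8)),
    "Ashburn (EST)": timezone(timedelta(hours=-5)),
    "Paris (CET)": timezone(timedelta(hours=1)),
}


def migration_windows():
    """Return (start, end) UTC datetimes for the three announced windows."""
    windows = []
    for day in (22, 23, 24):
        start = datetime(2013, 1, day, 17, 0, tzinfo=UTC)
        # 17:00 to 01:00 UTC spans eight hours and ends on the next day.
        end = start + timedelta(hours=8)
        windows.append((start, end))
    return windows


def localize(window, tz):
    """Convert a (start, end) window to the given time zone."""
    start, end = window
    return start.astimezone(tz), end.astimezone(tz)


if __name__ == "__main__":
    for window in migration_windows():
        for name, tz in ZONES.items():
            start, end = localize(window, tz)
            print(f"{name}: {start:%b %d %H:%M} to {end:%b %d %H:%M}")
```

Each window ends at 01:00 UTC on the following calendar day, so in time zones east of UTC the whole window can shift into the evening and night.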

Wikimedia sites have been hosted in our main data center in Tampa, Florida, since 2004; before that, the couple of servers powering Wikipedia were in San Diego, California. Ashburn is the third and newest primary data center to host Wikimedia sites.

A major reason for choosing Tampa, Florida as the location of the primary data center in 2004 was its proximity to founder Jimmy Wales’ home, at a time when he was much more involved in the technical operations of the site. In 2009, the Wikimedia Foundation’s Technical Operations team started to look for other locations with better network connectivity and more clement weather. Located in the Washington, D.C. metropolitan area, Ashburn offers faster and more reliable connectivity than Tampa, and usually fewer hurricanes.

The Operations team started to plan and prepare for the Virginia data center in Summer 2010. The actual build-out and racking of servers at the colocation facility started in February 2011, and was followed by a long period of hardware, system and software configuration. Traffic started to be served to users from the Ashburn data center in November 2011, in the form of CSS and JavaScript assets (served from “bits.wikimedia.org”).

We reached a major milestone in February 2012, when caching servers were set up to handle read-only requests for Wikipedia and Wikimedia content, which represent most of the traffic to Wikipedia and its sister sites. In April 2012, the Ashburn data center also started to serve media files (from “upload.wikimedia.org”).

Cacheable requests represent about 90 percent of our traffic, leaving 10 percent that requires interaction with our web (Apache) and database (MySQL) servers, which are still being hosted in Tampa. Until now, every edit made to a Wikipedia page has been handled by the servers in Tampa. This dependency on our Tampa data center was responsible for the site outage in August 2012, when a fiber cut severed the connection between our two locations.

Starting next week, the new servers in Ashburn will take on that role as well, and all our sites will be able to function fully without relying on the servers in Florida. The legacy data center in Tampa will continue to be maintained, and will serve as a secondary “hot failover” data center: servers will be in standby mode, ready to take over should the primary site experience an outage. Server configuration and data will be synchronized between the two locations to ensure as smooth a transition as possible in case of technical difficulties in Ashburn.

Besides just installing newer hardware, setting up the data center in Ashburn has also been an opportunity for architecture overhauls, like incremental improvements of the text storage system, and the move to an entirely new media storage system to keep up with the growth of the content generated and curated by our contributors.

Wikimedia’s technical infrastructure aims to be as open and collaborative as the sites it powers. Most of the configuration of our servers is publicly accessible, and the Wikimedia Labs initiative allows contributors to test and submit improvements to the sites’ configuration files.

The Wikimedia Foundation currently operates a total of about 885 servers, and serves about 20 billion page views a month, on a non-profit budget that relies almost entirely on donations from readers.

Guillaume Paumier
Technical Communications Manager

 

Wikimedia engineering December 2012 report

Major news in December includes:

Note: We’re also proposing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.


Wikimedia engineering November 2012 report

Major news in November includes:

Note: Like last month, we’re proposing a shorter and simpler version of this report for less technically savvy readers.


Introducing Wikipedia’s new HTML5 video player

A new video player has been enabled on Wikipedia and its sister sites, and it comes with the promise of bringing free educational videos to more people, on more devices, in more languages.

The player is the same HTML5 player used in the Kaltura open-source video platform. It has been integrated with MediaWiki (the software that runs Wikimedia sites like Wikipedia) through an extension called TimedMediaHandler. It replaces an older Ogg-only player that has been in use since 2007.

The new player supports closed captions in multiple languages.

Based on HTML5, the new player plays audio and video files on wiki pages. It brings many new features, like advanced support for closed captions and other timed text. By allowing contributors to transcribe videos, the new player is a significant step towards accessibility for hearing-impaired Wikipedia readers. Captions can easily be translated into many languages, thus expanding their potential audience.

TimedMediaHandler also comes with other useful features, like support for the royalty-free WebM video format. Support for WebM makes it possible to seamlessly import videos encoded in that format, such as freely licensed content from YouTube’s massive library.

Even further behind the scenes, TimedMediaHandler adds support for server-side transcoding, i.e. the ability to convert from one video format to another, in order to deliver the appropriate video stream to the user depending on their bandwidth and the size of the player. For example, support for mobile formats is available, although it is not currently enabled.
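To illustrate the idea of matching a stream to the viewer, here is a minimal Python sketch. It is not TimedMediaHandler’s actual logic: the rendition names, resolutions, bitrates and the `pick_rendition` function are all hypothetical, chosen only to show how bandwidth and player size could drive the choice among transcoded versions of one video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str          # hypothetical label for a transcoded version
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # approximate bandwidth needed to stream it


# Hypothetical set of transcodes produced from one uploaded video.
RENDITIONS = [
    Rendition("low", 240, 300),
    Rendition("medium", 480, 1000),
    Rendition("high", 720, 2500),
]


def pick_rendition(renditions, bandwidth_kbps, player_height):
    """Pick the largest rendition that fits both the bandwidth and the player.

    If nothing fits, fall back to the rendition with the lowest bitrate so
    the viewer still gets something playable.
    """
    fitting = [
        r for r in renditions
        if r.bitrate_kbps <= bandwidth_kbps and r.height <= player_height
    ]
    if fitting:
        return max(fitting, key=lambda r: r.height)
    return min(renditions, key=lambda r: r.bitrate_kbps)


if __name__ == "__main__":
    # A small embedded player on a fast connection gets the low rendition,
    # because a larger one would not be visible at that player size.
    print(pick_rendition(RENDITIONS, bandwidth_kbps=5000, player_height=300).name)
    # A large player on a moderate connection gets the medium rendition.
    print(pick_rendition(RENDITIONS, bandwidth_kbps=1500, player_height=720).name)
```

The point of transcoding on the server is that these alternatives exist ahead of time, so the choice at playback is a cheap lookup and does not require converting video on demand.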

The player’s “Share” feature provides a short snippet of code to directly embed videos from Wikimedia Commons in web pages and blog posts, as is the case here.

Sponsored by Kaltura and Google, developers Michael Dale and Jan Gerber are the main architects of the successful launch of the new player. With the support of the Wikimedia Foundation’s engineering team and Kaltura, they have gone through numerous cycles of development, review and testing to finally release the fruits of years of work.

Efforts to better integrate video content into Wikipedia and its sister sites date back to early 2008, when Kaltura and the Wikimedia Foundation announced their first collaborative video experiment. Since then, incremental improvements have been released, but the deployment of TimedMediaHandler is the most significant achievement to date.

Wikimedia engineering October 2012 report

Major news in October includes:

Note: As of last month, we’re proposing a shorter and simpler version of this report for less technically savvy readers.


Wikimedia engineering September 2012 report

Major news in September includes:

Wikimedia engineering August 2012 report

Major news in August includes:

Note: Following a reader’s advice, we’re trying out slightly colored backgrounds to help readers skim through sections. Let us know in the comments how that works for you, and how to improve it for the next report.

Wikimedia engineering July 2012 report

Major news in July includes:

  • engineering presence at Wikimania 2012 in Washington, D.C., and the pre-Wikimania hackathon;
  • the launch of Limn, an open source dataviz toolkit developed by the Wikimedia analytics team;
  • the deployment of Article Feedback Version 5 (which supports free-text feedback and moderation thereof) to 10% of English Wikipedia articles.
