Wikimedia blog

News from inside the Wikimedia Foundation

Posts by Rob Lanphier

The MediaWiki Core group

This is the last in my series of introductory posts about Wikimedia Platform Engineering, focusing on the MediaWiki Core group. This group is responsible for our sites’ stability, security, performance and architectural cleanliness. In practice, this translates into a lot of code review, along with infrastructure projects like disk-backed object cache, heterogeneous deployment, continuous integration, and performance-related work. While it’s not a prerequisite, everyone on this team started off as a volunteer developer. The whole engineering organization shares responsibility for our code review process, but this group carries more of that responsibility than most. We have an open position in this group.

(more…)

Technical Liaison; Developer Relations (TL;DR)

This is yet another post about what everyone does in Platform Engineering, this time focusing on the Technical Liaison; Developer Relations (TL;DR) group.

The TL;DR group is responsible for development community relations, ensuring a healthy relationship between the Wikimedia Foundation and our volunteer development community. This team is responsible for removing obstacles to effective volunteer participation, communicating about what we’re up to now, and patrolling for new opportunities for volunteers to get involved and new volunteers to involve.

While everyone in Engineering is responsible for those things to some extent, this team helps fortify our commitment to this. And, just like they prodded me into the last two posts, they’ve prodded me to make this post. Once again, the reason for posting this is twofold: 1) because it’s generally good that everyone knows what it is that the WMF invests in, and 2) because we’re hiring, and we (still) want to get the word out. (more…)

Data analytics at Wikimedia Foundation

This post is a follow-on to my previous post “What is Platform Engineering?”. In this post, I’ll describe the history of our analytics work, talk about how we derive and distribute our statistics, and ask you to join us in building our platform. Summary: we’re hiring, and we want to tell you what a great opportunity this is.

Our Data Analytics team is responsible for building out our logging and data mining infrastructure, and for making Wikimedia-related statistics useful to other parts of the Foundation and the movement.  Up until fairly recently, Erik Zachte has been the main analytics person for Wikimedia (with support from many generalists here), working first as a volunteer building stats.wikimedia.org, then on behalf of Wikimedia Foundation starting in 2008.  It started off as a large number of detailed page view and editor statistics about all Wikimedia wikis, large and small, and has since been augmented to include various summary formats and visualizations.  As the movement has grown, it has played an increasingly important role in helping guide our investments.

MediaWiki 1.18 deployment today to all Wikimedia sites

As reported two weeks ago, we’re planning to deploy MediaWiki 1.18 to all wikis, starting today (Tuesday, October 4, 23:00-03:00 UTC). We have been running MediaWiki 1.18 on several wikis already, representing about 2% of our total traffic. A big thank you to the early adopter wikis; it was very helpful getting some real world testing prior to deploying to the other 98%.

Our deployment process works like this. During our four hour deployment window, we’ll be deploying to several wikis sequentially. Our tentative plan for today is deploying first to fr.wikipedia.org, then pl.wikipedia.org, then en.wikipedia.org, and then probably a few more sequentially before deploying to the rest in bulk.  At the end of our window, we will be stopping deployment, even if we’re not done (scheduling a followup window if needed).
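For the technically curious, the window logic described above can be sketched roughly as follows. This is only an illustration, not our actual deployment tooling: `run_window`, `fake_deploy`, and `fake_clock` are hypothetical names, the 90-minute deploy time is made up for the simulation, and which wikis land in the bulk group is an assumption.

```python
from datetime import datetime, timedelta, timezone


def run_window(sequential, bulk, start, length, deploy, clock):
    """Deploy wikis one at a time, then the rest in bulk, within one window.

    Returns (deployed, remaining). `deploy` and `clock` are hypothetical
    callables supplied by the caller; nothing here talks to real servers.
    """
    deadline = start + length
    deployed = []

    # Sequential phase: check the clock before starting each wiki.
    for wiki in sequential:
        if clock() >= deadline:
            break
        deploy(wiki)
        deployed.append(wiki)

    # Bulk phase: only if every sequential wiki made it and time is left.
    if len(deployed) == len(sequential) and clock() < deadline:
        for wiki in bulk:
            deploy(wiki)
            deployed.append(wiki)

    # Anything not deployed is left for a follow-up window.
    remaining = [w for w in [*sequential, *bulk] if w not in deployed]
    return deployed, remaining


# Simulated run: each deploy "takes" 90 minutes (an invented figure).
start = datetime(2011, 10, 4, 23, 0, tzinfo=timezone.utc)
elapsed = [timedelta(0)]


def fake_deploy(wiki):
    elapsed[0] += timedelta(minutes=90)


def fake_clock():
    return start + elapsed[0]


done, left = run_window(
    ["fr.wikipedia.org", "pl.wikipedia.org", "en.wikipedia.org"],
    ["commons.wikimedia.org", "incubator.wikimedia.org"],
    start, timedelta(hours=4), fake_deploy, fake_clock,
)
print("deployed:", done)
print("left for a follow-up window:", left)
```

In this simulated run the three sequential wikis use up the four hours, so the bulk group is returned as remaining work rather than being pushed out after the deadline.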

To report issues in real time (especially during the deployment window), IRC is the best venue; please join us in #wikimedia-tech on Freenode (web access). For those of you who are comfortable with Bugzilla and other development tools, we would love your help with confirming issues and getting appropriate issues filed in our bug tracker. If you don’t feel comfortable using Bugzilla, you can leave a message on the talk page of our announcement on meta. Our developers can keep track of issues much better when you use Bugzilla, so filing there makes it more likely your problem will be noticed and eventually addressed.

Thanks for your patience!

Rob Lanphier
Director of Platform Engineering

Update: October 5 05:14 UTC – we didn’t get as far as we wanted, but we deployed to fr.wikipedia.org, pl.wikipedia.org, en.wikipedia.org, and commons.wikimedia.org.  We’re planning to have one more deployment window in a little less than 18 hours (October 5, 23:00-October 6, 03:00 UTC) to deploy to the remaining wikis.

MediaWiki 1.18 is coming

[Update 2011-09-24: The initial test deployment and stage 1 have gone well, with only minor glitches that we've mostly cleaned up.  Stage 2 and 3 are currently on schedule.  We've decided to add incubator.wikimedia.org to the list of wikis we'll be deploying to, which is reflected below.]

MediaWiki 1.18 will soon be deployed to all Wikimedia sites, including Wikipedia. As you may know, MediaWiki is the wiki software developed by the Wikimedia community, and 1.18 is the upcoming version of the software that has been in development since December.

Thanks to the completion of the heterogeneous deployment project, we are now able to run different versions of MediaWiki concurrently on Wikimedia sites. This means that we don’t have to upgrade all sites at the same time any more, which should limit the problems we encounter.
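One way to picture heterogeneous deployment is as a per-wiki version lookup with a default. The sketch below is a toy model under stated assumptions, not the real configuration: the `example.org` wiki names, `version_for`, and `upgrade` are all hypothetical, and the only facts taken from this post are that 1.18 is rolling out alongside the previous version, 1.17.

```python
from collections import Counter

DEFAULT_VERSION = "1.17"  # version a wiki runs unless overridden (assumed model)

# Hypothetical per-wiki overrides; the wiki names are illustrative only.
overrides = {
    "test.example.org": "1.18",
}


def version_for(wiki, overrides, default=DEFAULT_VERSION):
    """Look up which MediaWiki version a given wiki should run."""
    return overrides.get(wiki, default)


def upgrade(wikis, version, overrides):
    """Return a new overrides map with `wikis` moved to `version`."""
    updated = dict(overrides)
    updated.update({wiki: version for wiki in wikis})
    return updated


all_wikis = ["test.example.org", "a.example.org", "b.example.org"]

# Move one more wiki to 1.18 while the rest stay on the default.
stage1 = upgrade(["a.example.org"], "1.18", overrides)
counts = Counter(version_for(wiki, stage1) for wiki in all_wikis)
print(dict(counts))
```

Because `upgrade` returns a new map instead of mutating the old one, going back to the earlier state in this toy model just means using the earlier map again.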

The deployment is scheduled to happen in several stages, starting next week:

Wikis in Stage 1 and 2 may experience more issues, so we plan to focus our attention on those wikis during these periods, and be particularly responsive. If you’d like to help make sure we catch problems before we roll out to your wiki, please help us test by trying out the test wiki starting Tuesday and reporting the issues you find.

(more…)

What is “Platform Engineering”?

If you’ve been following this blog or other Wikimedia Foundation updates closely over the past year, you may have seen several references to the “Platform Engineering” group (née “General Engineering”), which is the group I’ve been managing for the past year. I’d like to explain who we are, and what we’re doing. We always strive for transparency as a group, but one ulterior motive for this particular narrative is that we’re hiring (more on that in a bit), and we hope this helps people understand what we’re looking for.

(more…)

Welcome, Sumana Harihareswara, Volunteer Development Coordinator

I’m thrilled to welcome Sumana Harihareswara to WMF Engineering! Sumana will be filling the role of Volunteer Development Coordinator. We interviewed many great candidates for the role, but decided the role would be best served by WMF continuing to work with Sumana on a contract basis. She’ll be working from her home in New York.

Sumana started in a part-time capacity back in March coordinating our participation in Google Summer of Code, as well as helping plan WMF’s participation in the Berlin Developer meeting happening later this month. Starting after the Berlin Developer meeting, she’ll be dedicating her working time to Foundation issues.

In addition to the specific initiative above, she’ll be recruiting and encouraging volunteers more generally. In the near term, she’ll be evangelizing movement priorities within the development community, and working toward matching interested volunteers and organizations to important movement work. She’ll be working with Bugmeister Mark Hershberger on bug triage and finding volunteers to test and fix MediaWiki. She’ll also gather some baseline metrics about our volunteer and corporate communities to measure our progress against. And she’ll be coordinating WMF development work in other open source communities as appropriate. Her Open Source Bridge talk last year (“The Second Step: HOWTO encourage open source work at for-profits”) is particularly relevant to this last task.

Sumana is currently an active contributor in the GNOME community, as a writer and editor for GNOME Journal, and recently led the marketing effort for GNOME 3.0.  She is also a blogger at GeekFeminism, and a longtime participant in open source communities.  She has worked at the GNOME Foundation, QuestionCopyright.org, Collabora, Fog Creek Software, and Salon.com, and contributed to the AltLaw, Empathy, Miro, and Zeitgeist open source projects.  She’s written a weekly newspaper column and has performed (and taught) stand-up comedy.

Sumana intends to communicate with the MediaWiki and Wikimedia communities in many ways: via IRC and mailing lists, conference calls, and frequent visits to WMF headquarters from New York City and to relevant conferences, both MediaWiki-related and not. For example, she’ll be speaking again this year at Open Source Bridge, giving a talk titled “Learn Tech Management in 45 Minutes”.

If you’re interested in learning more, or dropping a comment on her talk page, Sumana’s user page on MediaWiki.org has much more information.

Welcome, Sumana!

–Rob Lanphier, Engineering Programs Manager for General Engineering

MediaWiki 1.16.4 security release

MediaWiki 1.16.4 is the second security release this week. Shortly after the previous release (1.16.3), Masato Kinugawa discovered that one of the XSS problems that the 1.16.3 release was designed to address hadn’t been fully addressed, and reported bug 28507. As a consequence, Internet Explorer 6 users visiting a site running 1.16.3 will still be vulnerable to an XSS attack. After more thorough testing (thanks Roan Kattouw!), we’re releasing 1.16.4.

Full details are in Tim Starling’s 1.16.4 release announcement. Sorry for the inconvenience of a second release, and thank you to everyone involved in getting this fixed!

MediaWiki 1.16.3 security release

There is a new MediaWiki release available which addresses three security vulnerabilities:

  • A cross-site scripting (XSS) issue involving media uploads, affecting Internet Explorer version 6 and earlier. Note: fully addressing this issue requires web server configuration changes. See bug 28235 and the full announcement below for details (discovered by Masato Kinugawa).
  • A CSS validation problem in the wikitext parser. This is a cross-site scripting (XSS) issue for all Internet Explorer clients, and a privacy loss issue for other clients. See bug 28450 and the full announcement below for details (discovered by user Suffusion).
  • A transwiki import problem with access control checks on form submission, which only affects wikis where this feature is enabled. See bug 28449 and the full announcement below for details (discovered by MediaWiki developer Happy-Melon).

Full announcement from Tim Starling after the jump…

(more…)

Site fixes this week

We’re still in the middle of cleaning up some lingering issues from the 1.17 deployment, and despite our best efforts, you may see a little bit of quirkiness in the site:

  • Since the deployment, a problem with our job queue meant that emails that were supposed to be sent from the site weren’t. This backlog was cleared last night, and a lot of pent-up email was sent.
  • There were some HTML cache invalidations that caused parts of the site to get overloaded for a few minutes.
  • Yesterday, we started the deployment of the category sorting improvements. We deployed some modifications to the database today. This resulted in a few hiccups on the site that we’ve since mostly recovered from.

Category collation

One key set of improvements in the MediaWiki 1.17 release is the category sorting work spearheaded by Aryeh Gregor. This code will eventually improve the sorting of categories in different languages, allowing us to choose the most appropriate sort order for the language. For now, we’re at least switching over to a more sensible sorting algorithm, the Unicode Collation Algorithm (UCA), and have made other improvements to sorting.
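To see why plain code point ordering is a poor fit for category listings, here is a small illustration. Please note that the "rough" key below is not the Unicode Collation Algorithm and is not what MediaWiki runs; it is a crude stand-in that only ignores case and accents, and `rough_collation_key` and the sample titles are made up for this example.

```python
import unicodedata


def rough_collation_key(title):
    """Crude stand-in for a collation key (NOT real UCA).

    Compares titles with case and accents ignored first, then falls back
    to the original string to break ties deterministically.
    """
    decomposed = unicodedata.normalize("NFD", title)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), title)


# "\u00c9clair" is "Éclair" written with an explicit escape for the accented E.
titles = ["Zebra", "apple", "\u00c9clair", "Echo", "banana"]

# Plain code point order: uppercase before lowercase, accented letters last.
by_code_point = sorted(titles)

# Rough key: letters are grouped the way a reader would expect.
by_rough_key = sorted(titles, key=rough_collation_key)

print(by_code_point)
print(by_rough_key)
```

Real UCA goes well beyond this, with multi-level comparison and language-specific tailoring, which is what makes choosing a sort order per language possible later on.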

This set of changes required a modification of the database that we didn’t believe was risky, but was irreversible. Given how complicated the initial 1.17 deployment was, we decided to hold back on deploying this work.

There are still some maintenance scripts left to run before this work is fully deployed, but most of it is done.

Other fixes

We’re also aware of other problems with the job queue. We’re investigating them and hope to have them fixed soon.