This is the last in my series of introductory posts about Wikimedia Platform Engineering, focusing on the MediaWiki Core group. This group is responsible for our sites’ stability, security, performance and architectural cleanliness. This ends up translating into a lot of code review, along with infrastructure projects like disk-backed object cache, heterogeneous deployment, continuous integration, and performance-related work. While it’s not a prerequisite, everyone on this team started off as a volunteer developer. The whole engineering organization shares some responsibility for our code review process, but this group carries more of that responsibility than most. We have an open position in this group.
This is yet another post about what everyone does in Platform Engineering, this time focusing on the Technical Liaison; Developer Relations (TL;DR) group.
The TL;DR group is responsible for development community relations, ensuring a healthy relationship between the Wikimedia Foundation and our volunteer development community. This team is responsible for removing obstacles to effective volunteer participation, communicating about what we’re up to now, and patrolling for new opportunities for volunteers to get involved and new volunteers to involve.
While everyone in Engineering is responsible for those things to some extent, this team helps fortify our commitment to this. And, just like they prodded me into the last two posts, they’ve prodded me to make this post. Once again, the reason for posting this is twofold: 1) because it’s generally good that everyone knows what it is that the WMF invests in, and 2) because we’re hiring, and we (still) want to get the word out.
This post is a follow-on to my previous post “What is Platform Engineering?”. In this post, I’ll describe the history of our analytics work, talk about how we derive and distribute our statistics, and ask you to join us in building our platform. Summary: we’re hiring, and we want to tell you what a great opportunity this is.
Our Data Analytics team is responsible for building out our logging and data mining infrastructure, and for making Wikimedia-related statistics useful to other parts of the Foundation and the movement. Up until fairly recently, Erik Zachte has been the main analytics person for Wikimedia (with support from many generalists here), working first as a volunteer building stats.wikimedia.org, then on behalf of Wikimedia Foundation starting in 2008. It started off as a large number of detailed page view and editor statistics about all Wikimedia wikis, large and small, and has since been augmented to include various summary formats and visualizations. As the movement has grown, it has played an increasingly important role in helping guide our investments.
Earlier this month, about thirty MediaWiki developers and interested technologists gathered in New Orleans to learn and to work on Wikimedia’s technical infrastructure. We made broad progress on the infrastructure of innovation at Wikimedia (notes). Specifically:
- We are now much closer to officially opening the doors to Wikimedia Labs and giving far more people the ability to contribute to MediaWiki without having to set up and maintain their own development environments at home. Wikimedia Labs will provide hosted, virtualized test and development sandboxes for new and experienced programmers and systems administrators. Many developers got beta Labs accounts, we tested at a larger scale, and we fixed several bugs.
- Developers agreed to create a file backend abstraction layer to enable large-scale MediaWiki installations to use one of several storage systems to contain big collections of big media files. (Wikimedia plans on using Swift, which is open source.) Microsoft’s Ben Lobaugh and SAIC’s DJ Bauch collaborated towards improving MediaWiki’s performance on Microsoft technologies as well. Developers made architectural decisions, refactored some existing code, and improved documentation and tests for the SwiftMedia extension to MediaWiki.
- We now have a continuous integration server up and running. This will continuously run tests checking on the latest new features and bugfixes that developers write, resulting in fewer bugs and faster development. Developers will need to write tests to reap the benefits, so Chad Horohoe taught a test-writing workshop.
- Max Semenik finished and demonstrated the first version of his API Query Sandbox. This allows software developers anywhere to experiment with ways to automatically get data from Wikipedia or other sites that run MediaWiki, thus enabling wider and deeper reuse of Wikimedia content.
- Operations folks continued the Puppetization of our infrastructure: they completely reworked Varnish management in Puppet, and worked on Puppet configurations for SwiftMedia testing. This configuration management work will ensure that ops can move faster and more confidently in building and maintaining Wikimedia infrastructure. And Canonical’s Mark Mims and Kapil Thangavelu worked on improving methods for Wikimedia developers “to spin up stacks of services within the labs environment” using Juju (more details).
- Since the engineering department is planning a switch from Subversion to Git in the next few months, Brion taught nearly everyone there how Git works (slides, audio), and how we’ll be using Git in the future. This change in our source code repository and workflow will, we hope, enable more speed and flexibility in development, both for WMF developers and community contributors.
- We prioritized and addressed several open requests for the operations team and defect reports about the latest version of MediaWiki, 1.18, which had just been deployed across WMF sites.
- Roan found and fixed an issue that was spouting symbolic link errors into our Apache logs, so now it’ll be easier for us to see more dangerous errors in those logs.
- Google Summer of Code students Salvatore Ingala and Kevin Brown made progress on integrating their summers’ work into MediaWiki as used and deployed by others; Salvatore and WMF developer Roan Kattouw have a plan for getting his user scripts improvements reviewed and deployed, so they can benefit Wikimedia readers and editors.
- A volunteer came in on Friday night knowing nothing about developing for MediaWiki, and by the end of the weekend had a working development environment on her laptop and had some ideas about how to contribute.
- We had substantive conversations about the summer internship program and about third-party collaboration that will affect how we work in the future.
Thanks to Ryan Lane and Dana Isokawa for organizing the event with me, and thanks to Launch Pad New Orleans for providing the venue!
Our next developers’ event is a hackathon in Mumbai November 18-20 concentrating on internationalization, localization, and mobile work. To find out about other upcoming Wikimedia technical events, check the meetings wiki page, and follow @MediaWikiMeet on Identi.ca or Twitter.
Volunteer Development Coordinator
As reported two weeks ago, we’re planning to deploy MediaWiki 1.18 to all wikis, starting today (Tuesday, October 4, 23:00-03:00 UTC). We have been running MediaWiki 1.18 on several wikis already, representing about 2% of our total traffic. A big thank you to the early adopter wikis; it was very helpful getting some real world testing prior to deploying to the other 98%.
Our deployment process works like this. During our four hour deployment window, we’ll be deploying to several wikis sequentially. Our tentative plan for today is deploying first to fr.wikipedia.org, then pl.wikipedia.org, then en.wikipedia.org, and then probably a few more sequentially before deploying to the rest in bulk. At the end of our window, we will be stopping deployment, even if we’re not done (scheduling a followup window if needed).
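For readers who think in code, the sequential, time-boxed rollout described above can be sketched roughly as follows. This is only an illustration, not our actual deployment tooling: the function name `deploy_in_window`, the `deploy` callback, and the 90-minute duration per wiki are all made up for the example.

```python
from datetime import datetime, timedelta, timezone


def deploy_in_window(wikis, window_end, deploy, now=lambda: datetime.now(timezone.utc)):
    """Deploy to wikis one at a time; stop starting new ones once the window closes.

    Returns (done, remaining) so a follow-up window can pick up the rest.
    All names here are hypothetical, not real deployment scripts.
    """
    done = []
    for index, wiki in enumerate(wikis):
        if now() >= window_end:
            # Window is over: leave everything from this wiki onward for later.
            return done, list(wikis[index:])
        deploy(wiki)
        done.append(wiki)
    return done, []


# Simulated run: a fake clock that advances 90 minutes per deployment
# (an invented figure, purely to exercise the cutoff logic).
clock = {"now": datetime(2011, 10, 4, 23, 0, tzinfo=timezone.utc)}


def fake_deploy(wiki):
    clock["now"] += timedelta(minutes=90)


done, remaining = deploy_in_window(
    ["fr.wikipedia.org", "pl.wikipedia.org", "en.wikipedia.org", "commons.wikimedia.org"],
    clock["now"] + timedelta(hours=4),
    fake_deploy,
    now=lambda: clock["now"],
)
print("deployed:", done)
print("left for a follow-up window:", remaining)
```

The point of returning the leftover list is the same as in the plan above: when the window ends we stop, and whatever remains becomes the input to a follow-up window.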
To report issues in real time (especially during the deployment window), IRC is the best venue; please join us in #wikimedia-tech on Freenode (web access). For those of you who are comfortable with Bugzilla and other development tools, we would love your help with confirming issues and getting appropriate issues filed in our bug tracker. If you don’t feel comfortable using Bugzilla, you can leave a message on the talk page of our announcement on meta. Our developers can keep track of issues much better when you use Bugzilla, so filing issues there makes it more likely your problem will be noticed and eventually addressed.
Thanks for your patience!
Director of Platform Engineering
Update: October 5 05:14 UTC – we didn’t get as far as we wanted, but we deployed to fr.wikipedia.org, pl.wikipedia.org, en.wikipedia.org, and commons.wikimedia.org. We’re planning to have one more deployment window in a little less than 18 hours (October 5, 23:00-October 6, 03:00 UTC) to deploy to the remaining wikis.
[Update 2011-09-24: The initial test deployment and stage 1 have gone well, with only minor glitches that we've mostly cleaned up. Stage 2 and 3 are currently on schedule. We've decided to add incubator.wikimedia.org to the list of wikis we'll be deploying to, which is reflected below.]
MediaWiki 1.18 will soon be deployed to all Wikimedia sites, including Wikipedia. As you may know, MediaWiki is the wiki software developed by the Wikimedia community, and 1.18 is the upcoming version of the software that has been in development since December.
Thanks to the completion of the heterogeneous deployment project, we are now able to run different versions of MediaWiki concurrently on Wikimedia sites. This means that we don’t have to upgrade all sites at the same time any more, which should limit the problems we encounter.
The deployment is scheduled to happen in several stages, starting next week:
- Monday, September 19, 23:00-01:00 UTC — Production test: test2.wikipedia.org – this stage will ensure that 1.18 is compatible with the rest of our production infrastructure. There’s a small chance that changes here could affect all wikis.
- Wednesday, September 21, 23:00-03:00 UTC — Stage 1: simple.wikipedia.org, simple.wiktionary.org, usability.wikimedia.org, strategy.wikimedia.org, mediawiki.org, he.wikisource.org
- Monday, September 26, 23:00-03:00 UTC — Stage 2: meta.wikimedia.org, en.wikiquote.org, en.wikibooks.org, beta.wikiversity.org, eo.wikipedia.org, nl.wikipedia.org, incubator.wikimedia.org
- Tuesday, October 4, 23:00-03:00 UTC — Stage 3: remaining wikis.
Wikis in Stage 1 and 2 may experience more issues, so we plan to focus our attention on those wikis during these periods, and be particularly responsive. If you’d like to help make sure we catch problems before we roll out to your wiki, please help us test, by trying out the test wiki starting Tuesday, and report the issues you find.
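If you want to check which stage a given wiki falls into, the schedule above can be written down as a small lookup table. This is just a sketch for illustration: the names `STAGES` and `stage_for` are hypothetical, and the real configuration that maps wikis to MediaWiki versions does not necessarily look like this.

```python
# Staged rollout from the schedule above, as plain data.
# Any wiki not listed explicitly falls into Stage 3 ("remaining wikis").
STAGES = [
    ("Production test", ["test2.wikipedia.org"]),
    ("Stage 1", [
        "simple.wikipedia.org",
        "simple.wiktionary.org",
        "usability.wikimedia.org",
        "strategy.wikimedia.org",
        "mediawiki.org",
        "he.wikisource.org",
    ]),
    ("Stage 2", [
        "meta.wikimedia.org",
        "en.wikiquote.org",
        "en.wikibooks.org",
        "beta.wikiversity.org",
        "eo.wikipedia.org",
        "nl.wikipedia.org",
        "incubator.wikimedia.org",
    ]),
]
DEFAULT_STAGE = "Stage 3"


def stage_for(wiki):
    """Return the name of the stage in which the given wiki is upgraded."""
    for name, wikis in STAGES:
        if wiki in wikis:
            return name
    return DEFAULT_STAGE


for wiki in ("mediawiki.org", "nl.wikipedia.org", "en.wikipedia.org"):
    print(wiki, "->", stage_for(wiki))
```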
If you’ve been following this blog or other Wikimedia Foundation updates closely over the past year, you may have seen several references to the “Platform Engineering” group (nee “General Engineering”), which is the group I’ve been managing for the past year. I’d like to explain who we are, and what we’re doing. We always strive for transparency as a group, but one ulterior motive for this particular narrative is that we’re hiring (more on that in a bit), and we hope this helps people understand what we’re looking for.
I’m thrilled to welcome Sumana Harihareswara to WMF Engineering! Sumana will be filling the role of Volunteer Development Coordinator. We interviewed many great candidates for the role, but decided the role would be best served by WMF continuing to work with Sumana on a contract basis. She’ll be working from her home in New York.
Sumana started in a part-time capacity back in March coordinating our participation in Google Summer of Code, as well as helping plan WMF’s participation in the Berlin Developer meeting happening later this month. Starting after the Berlin Developer meeting, she’ll be dedicating her working time to Foundation issues.
In addition to the specific initiative above, she’ll be recruiting and encouraging volunteers more generally. In the near term, she’ll be evangelizing movement priorities within the development community, and working toward matching interested volunteers and organizations to important movement work. She’ll be working with Bugmeister Mark Hershberger on bug triage and finding volunteers to test and fix MediaWiki. She’ll also gather some baseline metrics about our volunteer and corporate communities to measure our progress against. And she’ll be coordinating WMF development work in other open source communities as appropriate. Her Open Source Bridge talk last year (“The Second Step: HOWTO encourage open source work at for-profits”) is particularly relevant to this last task.
Sumana is currently an active contributor in the GNOME community, as a writer and editor for GNOME Journal, and recently led the marketing effort for GNOME 3.0. She is also a blogger at GeekFeminism, and a longtime participant in open source communities. She has worked at the GNOME Foundation, QuestionCopyright.org, Collabora, Fog Creek Software, and Salon.com, and contributed to the AltLaw, Empathy, Miro, and Zeitgeist open source projects. She’s written a weekly newspaper column and has performed (and taught) stand-up comedy.
Sumana intends to communicate with the MediaWiki and Wikimedia communities in many ways: via IRC and mailing lists, conference calls, and frequent visits to WMF headquarters from New York City and to relevant conferences, both MediaWiki-related and not. For example, she’ll be speaking again this year at Open Source Bridge, giving a talk titled “Learn Tech Management in 45 Minutes”.
If you’re interested in learning more, or dropping a comment on her talk page, Sumana’s user page on MediaWiki.org has much more information.
–Rob Lanphier, Engineering Programs Manager for General Engineering
There is a new MediaWiki release available which addresses three security vulnerabilities:
- A cross-site scripting (XSS) issue involving media uploads affecting Internet Explorer version 6 and earlier. Note: fully addressing this issue requires web server configuration changes. See bug 28235 and full announcement below for details (discovered by Masato Kinugawa).
- A CSS validation problem in the wikitext parser. This is a cross-site scripting (XSS) issue for all Internet Explorer clients, and a privacy loss issue for other clients. See bug 28450 and full announcement below for details (discovered by user Suffusion).
- A transwiki import problem with access control checks on form submission, which only affects wikis where this feature is enabled. See bug 28449 and full announcement below for details (discovered by MediaWiki developer Happy-Melon).
Full announcement from Tim Starling after the jump…
For those of you who have been eagerly awaiting the REST APIs, wait no more. You can now find our Bugzilla server’s APIs at https://bugzilla.wikimedia.org/bzapi. Documentation about the APIs is available at https://wiki.mozilla.org/Bugzilla:REST_API.
If you find any issues with the bugzilla instance or the APIs, submit a bug. Make sure you set the Component to Bugzilla so that it gets my attention.
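As a starting point, here is a minimal sketch of how request URLs for the API might be built. It assumes the `/bug/<id>` and `/bug?<field>=<value>` paths described in the Mozilla documentation linked above; the helper names `bug_url` and `search_url` are made up for this example, and the snippet only constructs URLs without contacting the server.

```python
from urllib.parse import urlencode

# Root of the REST API mentioned above.
BZAPI_ROOT = "https://bugzilla.wikimedia.org/bzapi"


def bug_url(bug_id):
    """URL for a single bug (assumes the /bug/<id> path from the BzAPI docs)."""
    return f"{BZAPI_ROOT}/bug/{bug_id}"


def search_url(**fields):
    """URL for a bug search (assumes /bug?<field>=<value> from the BzAPI docs).

    Fields are sorted so that the same query always yields the same URL.
    """
    return f"{BZAPI_ROOT}/bug?{urlencode(sorted(fields.items()))}"


print(bug_url(28235))
print(search_url(component="Bugzilla"))
```

You can paste the resulting URLs into a browser or fetch them with any HTTP client; check the documentation for the exact field names and response format before relying on them.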