Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Wikimedia engineering June 2011 report

Major news this month includes:

  • the network setup in our new data center, which opened the way for new server setups and backups;
  • progress on features to encourage and facilitate participation, like the Visual editor groundwork, and the WikiLove button;
  • productive community testing of our new mobile front-end and the Kiwix download manager;
  • the release of MediaWiki 1.17.0;
  • the first commits by our Summer of Code students;
  • major progress on our code review backlog.

Note: This month, we’re trying out a slightly modified format for the report. Hover your mouse over the green question marks ([?]) to see a description of a particular project.

Events

Recent events

Upcoming events

  • OSCON (July 25-29, Portland, Oregon, USA) — A delegation of about a dozen Wikimedia engineers will be attending the Open Source Convention in July. Many of them will give talks (see full schedule).
  • Wikimania (August 2-7, Haifa, Israel) — Another delegation of about a dozen Wikimedia engineers will be attending the Wikimania conference in August. Besides the Developer Days and OpenZIM Developers Meeting, they will also give several talks; the full schedule is now available.
  • Check out the Software deployments page on the wikitech wiki for up-to-date information on the upcoming deployments to Wikimedia sites.

Personnel

Job openings

Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.

The following positions have opened this month:

Requests for proposals:

The following positions are still open: Software Developer (Features), Systems Engineer (Data Analytics), Operations Engineer, Networking Contractor (Amsterdam), Software Developer (Rich Text Editing, Features), Product Manager (Features), Software Developer (Front-end) and Software Developer (Back-end).

Short news

Operations

Site infrastructure

  • Virginia Data Center [?] — In June, we finished the network setup in eqiad, our facility at Equinix in Ashburn, Virginia. Two independent transport connections between our two data centers in Tampa and Ashburn were installed, and local IP transit (Internet connectivity) is now available in Ashburn as well. We started replicating our thumb server from ms5 (Tampa) to ms1004 (eqiad). All 48 of our database servers are now up and running, and database replication will start this week. Several services should come online from eqiad in July.
  • HTTPS & IPv6 — Ryan Lane announced that Wikimedia sites would be switching to protocol-relative URLs in July, as part of the work to properly support HTTPS. Support for the HTTPS cluster was added to geoiplookup, and an nginx logging module was written for udp2log. HTTPS has been tested for bits, upload, test Wikipedia, and some smaller wikis.
  • Summer of Research 2011 — Asher Feldman and Ryan Lane created the systems infrastructure for the Summer of Research team to perform data mining and analysis work.
  • Mobile site issues — In June, we noticed a lot of 500 errors for image resources on our mobile platform. The Operations team enhanced the capacity, performance and stability of our mobile gateway by adding two new servers and upgrading software.
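
To illustrate the protocol-relative URLs mentioned in the HTTPS item above: a reference that starts with "//" omits the scheme, so the browser reuses whichever scheme the page itself was loaded with. The following is a minimal Python sketch of that resolution rule, not the actual MediaWiki change; the example.org host names are hypothetical placeholders.

```python
from urllib.parse import urljoin


def resolve(page_url: str, resource_url: str) -> str:
    """Resolve a resource reference against the URL of the page embedding it."""
    return urljoin(page_url, resource_url)


# A protocol-relative reference: no "http:" or "https:" prefix.
# The host name is a hypothetical placeholder.
resource = "//upload.example.org/logo.png"

over_http = resolve("http://en.example.org/wiki/Main_Page", resource)
over_https = resolve("https://en.example.org/wiki/Main_Page", resource)

print(over_http)   # http://upload.example.org/logo.png
print(over_https)  # https://upload.example.org/logo.png
```

The same markup can therefore be served to both HTTP and HTTPS readers without triggering mixed-content warnings on the secure version.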

Testing environment

  • Virtualization test cluster [?] — Networking was set up, and instances can now be created and register with puppet. Work is now being done to move the puppet configuration into a git repository so that instances can fully build themselves. This project has been temporarily slowed down in favor of deploying HTTPS.

Backups and data archives

  • Data Dumps [?] — The first run of the English Wikipedia dumps on the new performant server was somewhat disappointing. More than half of the files were truncated, which indicates that running 32 parallel jobs is too demanding. Testing will be done in July to find the optimal number of concurrent threads. In the meantime, the truncated files are being regenerated in smaller batches. Code to detect truncated dump files was committed and will soon be deployed. This code uncovered an unrelated issue with the Chinese Wikipedia history dumps that has been around for almost a year. Installs are now fully puppetized, with just one host left to upgrade.
  • Backups [?] — Network connectivity to our new data center is now available. The first data is being copied and/or replicated onto the new servers and storage systems in Ashburn, and all important data will be present in our new facility in July.
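
The report does not describe how the committed code detects truncated dump files. As a rough illustration only, one possible check for a bzip2-compressed file is to decompress it and confirm that the end-of-stream marker is reached. The function name, chunk size and demo files below are assumptions made for this sketch, and it only inspects the first bzip2 stream in a file.

```python
import bz2
import os
import tempfile


def is_truncated_bz2(path: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file ends before the bzip2 end-of-stream marker.

    Hypothetical helper for illustration; only the first stream is checked.
    """
    decompressor = bz2.BZ2Decompressor()
    with open(path, "rb") as handle:
        while not decompressor.eof:
            chunk = handle.read(chunk_size)
            if not chunk:
                # Input ran out before the stream was complete.
                return True
            decompressor.decompress(chunk)
    return False


# Demo with a complete file and a copy cut in half.
payload = bz2.compress(b"<mediawiki>" + b"x" * 100_000 + b"</mediawiki>")
with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "good.xml.bz2")
    bad_path = os.path.join(tmp, "bad.xml.bz2")
    with open(good_path, "wb") as handle:
        handle.write(payload)
    with open(bad_path, "wb") as handle:
        handle.write(payload[: len(payload) // 2])
    good_result = is_truncated_bz2(good_path)
    bad_result = is_truncated_bz2(bad_path)

print(good_result, bad_result)
```

Reading in chunks keeps memory use flat even for very large history dumps, at the cost of decompressing the whole file once.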

Features Engineering

Editing tools

  • Visual editor 0.1 [?] — Trevor Parscal continued to work on the front-end of the visual editor, and specifications for accessing the editing surface via the API. A hybrid rendering approach appears to be the best strategy for the visual editor. Neil Kandalgaonkar continued to work on the middleware, DOM and transactions. Neil also continued to work on a demo to integrate MediaWiki and Etherpad. With Alolita Sharma, they planned their upcoming sprints. Neil and Trevor are posting about their work to the wikitext-l list.
  • FlaggedRevs [?] — Aaron Schulz improved user preferences and changed the way statistics are stored in the database, among other minor improvements. Chad Horohoe helped review the backlog of unreviewed commits.

Content Quality and Editorial Tools

  • Article Feedback [?] — Additional features were added in June, like a dashboard tracking articles receiving low ratings. Roan Kattouw started to implement the UDP back-end to provide clicktracking metrics to assess user engagement. The community provided feedback and bug reports, and the development team addressed the concerns raised, for example by implementing a user preference to hide the tool. Tooltips were also added to provide more information on the meaning of the star ratings. Dario Taraborelli continued to evaluate the data provided by the articles already showing the feature. The incremental roll-out to all articles on the English Wikipedia is planned to be completed by mid-July.

Participation and editor retention

  • StructuredProfile [?] — This possible feature aims to make it easier for new editors to fill out their profile pages with meaningful information about their background and interests, and surface select profile information to experienced editors within lists such as recent changes, watchlist, etc. The ideas are still in development, and feedback is welcome on the StructuredProfile Talk page.
  • LiquidThreads 3.0 [?] — WMF work on this project was mostly on hold in June due to limited resources, which were assigned in priority to supporting the 2011 Board Election and the MoodBar.

Multimedia Tools

  • Upload wizard [?] — Neil Kandalgaonkar continued to fix bugs, and added functionality to show thumbnails before upload in modern browsers.

MediaWiki infrastructure

  • ResourceLoader 2.0 [?] — This project was mostly on hold in June due to the lack of engineering resources. Work is planned to resume in July.

Wikimedia Labs

  • Parser [?] — Brion Vibber continued to work on the parser plan, and also moved the “parser playground” gadget to an extension. He invited the developer community to use it and provide feedback (read more).

Mobile

  • Mobile Research [?] — In June, Parul Vora and Mani Pande completed their fieldwork in Brazil, consisting of 16 interviews in São Paulo, Salvador and Porto Alegre. They conducted extensive in-home interviews with three kinds of participants: readers of Wikipedia on a mobile phone, potential mobile readers (i.e. who currently use a computer, but could become mobile readers) and editors (primarily of the Portuguese Wikipedia, and to a lesser degree the English Wikipedia). The team also received about 6 proposals from US firms in response to our RfP, to conduct research in three cities in the US. The mobile survey is scheduled to be launched at the end of July.
  • Mobile site rewrite [?] — Tomasz Finc sent a call for testers to help test the prototype in English, Japanese and Hebrew. Feedback is now being addressed by the mobile team, which is tracking fixes and new feature requests in bugzilla. Patrick Reilly and Asher Feldman also worked together to profile the MobileFrontend extension (formerly “PatchOutputMobile”) and prep it for deployment. Next steps include its integration with our Varnish and Squid caching architecture, so that we can have the advantages of the WURFL mobile device database with acceptable performance.

Special projects

Fundraising support

  • 2011 Fundraiser [?] — Arthur Richards, Ryan Kaldari and Katie Horn are wrapping up their first development sprint, and preparing for the next one beginning on July 5. Their work focuses on new features for CentralNotice to better facilitate banner management, as well as back-end enhancements to the donation processing pipeline.

Offline

  • Wikipedia version tools [?] — GSoC student Yuvaraj Pandian continued to port the WP 1.0 bot to a MediaWiki extension. Mentored by Arthur Richards, and supported by WP 1.0 Bot author/maintainer User:CBM, Yuvaraj implemented an assessment template processing feature, and is now working on a WP 1.0 bot replacement feature that will automatically include real-time assessment statistics on project pages. A feature to filter and select articles based on assessment criteria is planned to be added in July.
  • Collections [?] — Tomasz Finc discussed future work with PediaPress, Wikimedia Italia and others. Possible directions include scaling the Collections extension to output much larger collections, integrating it with the new Kiwix download manager, improving general user experience and making it a general purpose solution for article selection.
  • Kiwix UX initiative [?] — Tomasz Finc posted the videos from our Berlin usability testing session, and released the next beta of Kiwix. Users of Kiwix can now easily download new openZim files right within the interface. We’re looking at connecting it to the Collections extension (see above) so that anyone can easily download new book collections. If you’re interested in participating, please check out the volunteer program. The team is especially in need of an expert who can help with some complex libtool issues.

Platform Engineering

MediaWiki Core

  • HipHop support [?] — HipHop support is still planned to be part of MediaWiki 1.20. In the meantime, we’re looking for volunteers to help us package it for different distributions. Please contact Sumana Harihareswara or the wikitech-l mailing list if you have experience with packaging or would like to get involved in this area.
  • Academic publications authentication proxy — Chad Horohoe started a project whose goal is to allow selected Wikimedians to access third-party academic publishing sites to help with content verifiability. The authentication challenges this entails are not trivial; Ryan Lane is also involved in this project, particularly because of his previous experience with OpenID (which may be used as a mechanism for tying into CentralAuth).
  • Shell requests — Priyanka Dhanda continued to process shell requests, after it appeared that a harmonization of wiki configuration options wouldn’t be a significant time saver.
  • Projects on hold — The App-level monitoring and Configuration management projects were delayed in June, in favor of other work like the 1.17 release. Some of them will be resumed in July.

Wikimedia analytics

  • Wikimedia Report Card 2.0 [?] — Erik Zachte and Nimish Gautam started a development sprint and worked on the back-end infrastructure, supported by Asher Feldman & Sam Reed. The information stored in a database is accessed via a new MediaWiki extension (“MetricsReporting”, see in SVN), and the visualization part uses jqPlot. The team hopes to demonstrate a prototype for the next report card in early July.

Technical Liaison; Developer Relations

  • Engineering project documentation [?] — Guillaume Paumier finalized the infrastructure for project pages, using templates and transclusion. Because of the tools’ limits, full automation wasn’t possible. He also continued to update project pages and statuses.
  • translatewiki.net support — The list of 500 most used MediaWiki interface messages was updated to help translators focus on the messages with the most impact. The Translate extension may be reviewed in July to be used for content translation on Wikimedia sites, e.g. on meta-wiki.

This article was written by Mark Bergsma, Tomasz Finc, Erik Möller, Alolita Sharma, CT Woo, Rob Lanphier, Howie Fung, Ryan Lane, Ariel Glenn, Sumana Harihareswara & Guillaume Paumier. See full revision history. A wiki version is also available.
