Wikimedia blog

News from inside the Wikimedia Foundation

MediaWiki

Tech meetup moves Wikimedia infrastructure forward

Earlier this month, about thirty MediaWiki developers and interested technologists gathered in New Orleans to learn and to work on Wikimedia’s technical infrastructure.  We made broad progress on the infrastructure of innovation at Wikimedia (notes).  Specifically:

Tim Starling and DJ Bauch driving towards greater media file storage system independence and robustness

  • We are now much closer to officially opening the doors to Wikimedia Labs and giving far more people the ability to contribute to MediaWiki without having to set up and maintain their own development environments at home.  Wikimedia Labs will provide hosted, virtualized test and development sandboxes for new and experienced programmers and systems administrators.  Many developers got beta Labs accounts, we tested at a larger scale, and we fixed several bugs.
  • Developers agreed to create a file backend abstraction layer to enable large-scale MediaWiki installations to use one of several storage systems to contain big collections of big media files.  (Wikimedia plans on using Swift, which is open source.) Microsoft’s Ben Lobaugh and SAIC’s DJ Bauch collaborated towards improving MediaWiki’s performance on Microsoft technologies as well.  Developers made architectural decisions, refactored some existing code, and improved documentation and tests for the SwiftMedia extension to MediaWiki.
  • We now have a continuous integration server up and running.  This will continuously run tests checking on the latest new features and bugfixes that developers write, resulting in fewer bugs and faster development. Developers will need to write tests to reap the benefits, so Chad Horohoe taught a test-writing workshop.

  • Max Semenik finished and demonstrated the first version of his API Query Sandbox.  This allows software developers anywhere to experiment with ways to automatically get data from Wikipedia or other sites that run MediaWiki, thus enabling wider and deeper reuse of Wikimedia content.
  • Operations folks continued the Puppetization of our infrastructure: they completely reworked Varnish management in Puppet, and worked on Puppet configurations for SwiftMedia testing. This configuration management work will ensure that ops can move faster and more confidently in building and maintaining Wikimedia infrastructure. And Canonical’s Mark Mims and Kapil Thangavelu worked on improving methods for Wikimedia developers “to spin up stacks of services within the labs environment” using Juju (more details).
  • Since the engineering department is planning a switch from Subversion to Git in the next few months, Brion Vibber taught nearly everyone there how Git works (slides, audio), and how we’ll be using Git in the "glorious Git future". This change in our source code repository and workflow will, we hope, enable more speed and flexibility in development, both for WMF developers and community contributors.
  • We prioritized and addressed several open requests for the operations team and defect reports about the latest version of MediaWiki, 1.18, which had just been deployed across WMF sites.
  • Roan found and fixed an issue that was spouting symbolic link errors into our Apache logs, so now it’ll be easier for us to see more dangerous errors in those logs.
  • Google Summer of Code students Salvatore Ingala and Kevin Brown made progress on integrating their summers’ work into MediaWiki as used and deployed by others; Salvatore and WMF developer Roan Kattouw have a plan for getting his user scripts improvements reviewed and deployed, so they can benefit Wikimedia readers and editors.
  • A volunteer came in on Friday night knowing nothing about developing for MediaWiki, and by the end of the weekend had a working development environment on her laptop and had some ideas about how to contribute.
  • We had substantive conversations about the summer internship program and about third-party collaboration that will affect how we work in the future.
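
The file backend abstraction layer mentioned in the list above can be pictured roughly as follows. This is only a minimal sketch in Python, not MediaWiki's actual code: the class names `FileBackend`, `MemoryBackend`, `LocalDiskBackend` and the helper `save_upload` are hypothetical, chosen to show how calling code can stay the same while the storage system underneath changes.

```python
from abc import ABC, abstractmethod
from pathlib import Path
import tempfile


class FileBackend(ABC):
    """Hypothetical storage interface (an assumption, not MediaWiki's API)."""

    @abstractmethod
    def store(self, name: str, data: bytes) -> None:
        """Save the file contents under the given name."""

    @abstractmethod
    def fetch(self, name: str) -> bytes:
        """Return the contents previously stored under the given name."""


class MemoryBackend(FileBackend):
    """Keeps files in a dictionary; handy for tests."""

    def __init__(self) -> None:
        self._files = {}

    def store(self, name: str, data: bytes) -> None:
        self._files[name] = data

    def fetch(self, name: str) -> bytes:
        return self._files[name]


class LocalDiskBackend(FileBackend):
    """Keeps files in a directory on the local filesystem."""

    def __init__(self, root: Path) -> None:
        self._root = root

    def store(self, name: str, data: bytes) -> None:
        (self._root / name).write_bytes(data)

    def fetch(self, name: str) -> bytes:
        return (self._root / name).read_bytes()


def save_upload(backend: FileBackend, name: str, data: bytes) -> int:
    """Store a file and return the size read back.

    The caller depends only on the interface, not on the storage system.
    """
    backend.store(name, data)
    return len(backend.fetch(name))


# The same calling code works against either backend.
with tempfile.TemporaryDirectory() as tmp:
    for backend in (MemoryBackend(), LocalDiskBackend(Path(tmp))):
        print(type(backend).__name__, save_upload(backend, "logo.png", b"\x89PNG"))
```

A backend for an object store such as Swift would, under this sketch, be one more subclass implementing the same two methods.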

Launch Pad New Orleans, a great venue

We also ate dinner together, walked Bourbon Street, and generally got to know colleagues we’d never met before.  I expect these relationships will bear fruit for years to come.

Thanks to Ryan Lane and Dana Isokawa for organizing the event with me, and thanks to Launch Pad New Orleans for providing the venue!

Our next developers’ event is a hackathon in Mumbai November 18-20 concentrating on internationalization, localization, and mobile work.  To find out about other upcoming Wikimedia technical events, check the meetings wiki page, and follow @MediaWikiMeet on Identi.ca or Twitter.

Sumana Harihareswara
Volunteer Development Coordinator
Wikimedia Foundation

Announcing the October 2011 Coding Challenge

Great programmers drive any successful tech organization, and great programmers can be hard to find. Fortunately, the Wikimedia Foundation has a unique advantage: millions of unique visitors, every single day. It’s Wikipedia’s global impact which has enabled us to mobilize hundreds of thousands of donors every year to support our mission in our annual fundraising campaign.

We wanted to find out if we could find some of the world’s best programmers using similar means, to cultivate potential future volunteers and job candidates.  Thus, the idea of the Coding Challenge was born.

This is, admittedly, an experiment, and we’re not entirely sure how it will go.  We’ve structured it like a contest, and contests can be tricky, because they have rules.  One of the most important rules: only one contestant per challenge can win The Grand Prize (an all-expenses paid trip to an eligible Wikimedia event).   We run the risk of creating an ultra-competitive environment, in which people care more about winning than about helping to build a better Wikipedia.

Let’s reiterate, then: the goal of the contest is to find the best programmers — and at WMF, we get to decide exactly what that means.  (See the part in the rules about “sole discretion”.)  And a great programmer must write great code, of course — but the greatest programmers know that it means a lot more than that, too.  Is the documentation good?  Have ideas been exchanged on the wikitech-l mailing list?  Were participants generous with their time and ideas?  Do we see those ideas proliferating through the code of others?

There’s also a lot of value in this contest even to those who don’t “win”.  Maybe there’s only one grand prize per challenge, but WMF can, and will, bestow accolades to everyone who writes good code and shares it with everyone else.  And those things matter: when a potential employer comes to call, and you can point to the nice folks at Wikipedia to vouch for your work, that’s no small thing.

There are great potential programmers all over the globe.  If we can convince even a small fraction of you to accept one of these challenges, we will consider this little experiment a resounding success.

The Coding Challenge will run until November 7, 2011, 23:59 UTC.

Greg DeKoenigsberg, Coding Challenge Coordinator
Erik Moeller, VP of Engineering and Product Development

MediaWiki 1.18 deployment today to all Wikimedia sites

As reported two weeks ago, we’re planning to deploy MediaWiki 1.18 to all wikis, starting today (Tuesday, October 4, 23:00-03:00 UTC). We have been running MediaWiki 1.18 on several wikis already, representing about 2% of our total traffic. A big thank you to the early adopter wikis; it was very helpful getting some real world testing prior to deploying to the other 98%.

Our deployment process works like this. During our four hour deployment window, we’ll be deploying to several wikis sequentially. Our tentative plan for today is deploying first to fr.wikipedia.org, then pl.wikipedia.org, then en.wikipedia.org, and then probably a few more sequentially before deploying to the rest in bulk.  At the end of our window, we will be stopping deployment, even if we’re not done (scheduling a followup window if needed).

To report issues in real time (especially during the deployment window), IRC is the best venue; please join us in #wikimedia-tech on Freenode (web access). For those of you that are comfortable with Bugzilla and other development tools, we would love your help with confirming issues and getting appropriate issues filed in our bug tracker. If you don’t feel comfortable using Bugzilla, you can leave a message on the talk page of our announcement on meta. Our developers can keep track of issues much better when you use Bugzilla, so filing it there makes it more likely your problem will be noticed and eventually addressed.

Thanks for your patience!

Rob Lanphier
Director of Platform Engineering

Update: October 5 05:14 UTC – we didn’t get as far as we wanted, but we deployed to fr.wikipedia.org, pl.wikipedia.org, en.wikipedia.org, and commons.wikimedia.org.  We’re planning to have one more deployment window in a little less than 18 hours (October 5, 23:00-October 6, 03:00 UTC) to deploy to the remaining wikis.

Google Summer of Code students reach project milestones

Congratulations to the seven Google Summer of Code students who made it through the summer of 2011! They all accomplished a great deal, but want to continue contributing to ensure their work maximally benefits Wikimedia.

MediaWiki participated in Google Summer of Code 2011.

Yuvi Panda’s assessment parsing/aggregating extension aims “to make it easier to select and export article selections for various offline collections.” Yuvi needs some code review and suggestions on how to improve it to meet the Foundation’s quality standards for deployability, as he wrote to the developers’ mailing list.

Salvatore Ingala worked on making gadgets customizable. As he elaborated, that means:

  • “allowing gadgets to easily declare the list of configuration
    variables they have;
  • allowing users to easily change those settings, with an easy-to-use
    UI integrated to the Special:Preferences page.”

The next step is merging his code into trunk, which Salvatore’s planning with other MediaWiki developers.

Kevin Brown created the ArchiveLinks project to address the problem of linkrot on Wikipedia:

In articles we often cite or link to external URLs, but anything could happen to content on other sites — if they move, change, or simply vanish, the value of the citation is lost. ArchiveLinks rewrites external links in Wikipedia articles, so there is a ‘[cached]‘ link immediately afterwards which points to the web archiving service of your choice. This can even preserve the exact time that the link was added, so for sites which archive multiple versions of content (such as the Internet Archive) it will even link to a copy of the page that was made around the time the article was written.
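
To make the idea concrete, here is a small sketch of the rewriting step, assuming the Internet Archive's Wayback Machine as the archiving service. It is not ArchiveLinks' actual code: the function names are hypothetical, and for simplicity it works on bracketed external links in wikitext rather than on rendered pages. The only outside fact it relies on is the Wayback Machine's URL pattern, `https://web.archive.org/web/<timestamp>/<url>`.

```python
from datetime import datetime, timezone
import re

# Wayback Machine URL pattern: a 14-digit timestamp followed by the original URL.
WAYBACK = "https://web.archive.org/web/{stamp}/{url}"

# Matches bracketed external links such as [http://example.com/page Label].
EXTERNAL_LINK = re.compile(r"\[(https?://[^\s\]]+)([^\]]*)\]")


def cached_url(url: str, added: datetime) -> str:
    """Build an archive URL for a copy made around the time the link was added."""
    return WAYBACK.format(stamp=added.strftime("%Y%m%d%H%M%S"), url=url)


def add_cached_links(wikitext: str, added: datetime) -> str:
    """Append a 'cached' link immediately after every external link (hypothetical helper)."""

    def with_cached(match: re.Match) -> str:
        return f"{match.group(0)} [{cached_url(match.group(1), added)} cached]"

    return EXTERNAL_LINK.sub(with_cached, wikitext)


when_added = datetime(2011, 10, 4, 23, 0, 0, tzinfo=timezone.utc)
print(add_cached_links("See [http://example.com/page Example].", when_added))
```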

Kevin’s next step: getting a security review of his code, getting a starter feed set up so that the Internet Archive can start archiving it, and campaigning to interest Wikimedians and thus eventually get consensus to turn it on. At least one Wikimedian has already praised Kevin for his work.

Akshay Agarwal wrote a MediaWiki extension, SignupAPI, that makes it easier for a new user to create an account. “This extension creates a special page that cleans up SpecialUserLogin from signup related stuff, adds an API for signup, adds sourcetracking for account creation & provides Ajax-ified validation for signup form.” Akshay’s waiting for code review and discussion before the project can move forward further and benefit Wikimedia users.

Seven students contributed to various parts of MediaWiki, the wiki software that supports WMF sites.

Yuvi, Salvatore, Kevin, and Akshay all worked on features that they aim to get into Wikimedia Foundation-run wikis, such as Wikipedia, Wikisource, Wikinews, etc., sooner rather than later. In contrast, three students worked on extensions that will primarily benefit the larger MediaWiki community. For example, Yevhenii Vlasenko’s project was a “UserStatus” feature for SocialProfile. The SocialProfile extension is not currently deployed on any WMF wikis, but will benefit several other MediaWiki administrators and users. Zhenya finished his work but would like to continue by integrating better with social networks.

And two students worked on Semantic MediaWiki, which is also not currently deployed on any Wikimedia Foundation sites. Devayon Das made a “QueryCreator” and other improvements, and hopes to simplify its layout, make its interface easier to use, and add some features. And Ankit Garg worked on “Semantic Schemas”.

Congratulations to the students and their mentors.  Here’s hoping they’re all here to help out when next year’s interns roll in! :-)  And I’m looking forward to meeting Kevin and Salvatore, and introducing them to other Wikimedia & MediaWiki developers, at the New Orleans developers’ meetup next month.

Sumana Harihareswara
Volunteer Development Coordinator
Wikimedia Foundation

MediaWiki 1.18 is coming

[Update 2011-09-24: The initial test deployment and stage 1 have gone well, with only minor glitches that we've mostly cleaned up.  Stage 2 and 3 are currently on schedule.  We've decided to add incubator.wikimedia.org to the list of wikis we'll be deploying to, which is reflected below.]

MediaWiki 1.18 will soon be deployed to all Wikimedia sites, including Wikipedia. As you may know, MediaWiki is the wiki software developed by the Wikimedia community, and 1.18 is the upcoming version of the software that has been in development since December.

Thanks to the completion of the heterogeneous deployment project, we are now able to run different versions of MediaWiki concurrently on Wikimedia sites. This means that we don’t have to upgrade all sites at the same time any more, which should limit the problems we encounter.

The deployment is scheduled to happen in several stages, starting next week:

Wikis in Stage 1 and 2 may experience more issues, so we plan to focus our attention to those wikis during these periods, and be particularly responsive. If you’d like to help make sure we catch problems before we roll out to your wiki, please help us test, by trying out the test wiki starting Tuesday, and report the issues you find.

Filter preventing abusive edits comes to all wikis

The AbuseFilter extension for MediaWiki, which helps prevent vandalism on wikis, will be globally enabled on all Wikimedia projects later today.

AbuseFilter was developed by Andrew Garrett with support from the Wikimedia Foundation; it was first enabled on the English Wikipedia in March 2009.

Since then, many local wiki communities have asked individually for AbuseFilter to be enabled on their wiki. As of July 2011, AbuseFilter was already enabled on 66 of the 843 wikis the Wikimedia Foundation hosts.

It recently became apparent that it would be simpler to enable AbuseFilter by default on all wikis, rather than doing it on request.

When enabled, AbuseFilter comes with no built-in default filters, so no immediate change will be visible on wikis where it is enabled.

Unlike other anti-vandalism tools, AbuseFilter works by analyzing edits before they’re saved, rather than trying to identify (and revert) them after the fact.

Filters, or “rules”, can be added to AbuseFilter to identify certain kinds of edits matching a pattern. Actions can be taken for these edits, like tagging the edit, preventing the user from saving the page, or even automatically blocking the user. The AbuseFilter documentation provides the format in which filters must be written.
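
As a rough illustration of checking an edit before it is saved, here is a toy sketch in Python. It does not use AbuseFilter's own rule language (see the AbuseFilter documentation for that); the `Rule` class, the two example rules and the `check_edit` function are all hypothetical, and only two of the possible actions (tagging and preventing the save) are modeled.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """A hypothetical filter: a pattern plus the action to take on a match."""
    name: str
    pattern: str  # regular expression tested against the new page text
    action: str   # "tag" or "disallow"


# Example rules, made up for this sketch.
RULES = [
    Rule("possible-test-edit", r"\btest edit\b", "tag"),
    Rule("repeated-characters", r"(.)\1{9,}", "disallow"),
]


def check_edit(new_text, rules):
    """Evaluate an edit before it is saved.

    Returns (allowed, tags): whether the save may go ahead, and which
    tags should be attached to the edit.
    """
    allowed = True
    tags = []
    for rule in rules:
        if re.search(rule.pattern, new_text):
            if rule.action == "disallow":
                allowed = False
            elif rule.action == "tag":
                tags.append(rule.name)
    return allowed, tags


print(check_edit("This is a test edit", RULES))
print(check_edit("aaaaaaaaaaaa", RULES))
```

The first edit is allowed but tagged; the second is stopped before it reaches the page history.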

A screenshot of the list of AbuseFilter rules on the English Wikipedia

AbuseFilter catches abusive edits matching defined patterns.

Because AbuseFilter has been in use on the English Wikipedia for more than two years, more details about how AbuseFilter works are available in that wiki’s documentation; instructions on how to create a filter are also available.

It is possible to export filters from a wiki, and to import them into another one.

AbuseFilter is an extremely powerful tool, with the potential of preventing edits, blocking users, and making a whole wiki unusable. Therefore, it must be used with extreme caution; filters should only be created and edited by administrators who understand their purpose and syntax.

AbuseFilter can also be used to identify edits that are not abusive, for tracking purposes. Tags can be automatically added to edits matching a certain pattern, thus giving editors and patrollers a heads-up about certain edits (see examples).

Because such tags can also be used to identify legit edits, AbuseFilter is sometimes referred to as “Edit filter”.

AbuseFilter offers the possibility for certain filters to be private, to prevent long-time abusers from knowing how their edits are being identified.

We hope this tool will prove useful to our community of editors and patrollers.

Guillaume Paumier
Technical communications manager

MediaWiki’s Google Summer of Code students halfway through projects

MediaWiki’s Google Summer of Code students have been busy! We’re more than halfway through the summer, so here’s what they’re up to:

MediaWiki is participating in Google Summer of Code 2011.

  • Akshay Agarwal’s “Account Creation, Login Screens and AJAX-ification of everything” (mentor: Brandon Harris). Code, project status.
    The last task I accomplished: “Added source tracking functionality in the account creation API that I am building.”
    Something I’ve learned: “True learning can happen only in an open environment & with a highly supportive community.”
  • Kevin Brown’s “Working Archival for Web References/Citations,” “to facilitate the archival of external links used as references in the English Wikipedia” (mentor: Neil Kandalgaonkar). Code, project notes.
    The last task I accomplished: “Adding support for wget local archival, currently working on feed for external archival services.”
    Something I’ve learned: “Where do I start? A lot. I think the biggest thing is probably managing a large project and time management, which I still have a lot to learn on.”
  • Devayon Das’s “Improving Semantic Search/Semantic Query usability issues in SMW” (mentor: Markus Krötzsch). Code, project notes.
    The last task I accomplished: “Added RSS links to the results generated by the Query Creator interface I’m building.”
    Something I’ve learned: “A 30 second chat with a community member can save you 30 minutes of scratching your head in frustration.”
  • Ankit Garg’s “Semantic Schemas extension” (mentor: Yaron Koren). Code.
    The last task I accomplished: “I finished adding the inheritance support to the PageSchema XML structure.”
    Something I’ve learned: “I have a learned a great deal of PHP; also how to manage a huge project.”
  • Salvatore Ingala’s “AMICUS: Awesome Monolithic Infrastructure for Customization of User Scripts” (mentors: Max Semenik and Brion Vibber). Code, project notes.
    The last task I accomplished: “I made a prototypal user interface for editing preferences of an existing gadget, HotCat.”
    Something I’ve learned: “Unit testing is boooooring, but ends up saving you a lot of time!”
  • Yuvi Panda’s “Making Offline Wikipedia Article Selection Easier with Mediawiki Extensions” (mentor: Arthur Richards). Code, project.
    The last task I accomplished: “Filter articles based on name, quality and importance.”
    Something I’ve learned: “That spending time talking to everyone involved in the process from start to finish (devs, community maintainers, etc.) saves a truckload of time later on.”
  • Zhenya Vlasyenko’s “MediaWiki Extension: SocialProfile – UserStatus feature” (mentor: Jack Phoenix). Code.
    The last task I accomplished: “Internalization of the UserStatus feature with the help of the MakeGlobalVariablesScript hook.”
    Something I’ve learned: “I’ve found out for myself a new ways of data interaction between PHP and Javascript… Convinced that knowing some tricks and hooks can greatly save time.”

Aigerim Karabekova, who was working on extension release management, ran into several delays (including medical issues) and the project has been dropped. We’re glad she made the attempt and wish her the best.

Continued best wishes to Zhenya, Yuvi, Salvatore, Ankit, Devayon, Kevin, and Akshay as they work to make MediaWiki, and the Wikimedia experience, better.  We’re glad to be helping young developers learn how to contribute to our community.

Sumana Harihareswara
Wikimedia Foundation, Volunteer Development Coordinator

MediaWiki 1.17.0

We are proud to announce the first stable release of the 1.17 series.

MediaWiki 1.17 is a very large release that contains many new features and bug fixes. This is a summary of the major changes of interest to users. You can consult the release notes for the full list of changes in this version.

What’s new?

PHP 5.2.3

We now require PHP version 5.2.3 or later. Why? Well, it brings with it some tools for your beloved developers. It was released on June 1, 2007, so we believe this requirement will not be a hassle for administrators. Be sure to check your PHP installation and contact your host if it runs an outdated PHP version.

New installer

The installer now supports many languages!

MediaWiki 1.17 is shipping with a completely redesigned installer to fix a lot of outstanding bugs, clean up the code quality, and make it easier to use. Notably, you can now run upgrades from the web without having to move LocalSettings.php. A couple of other notable changes:

  • The installer can now be fully localized like the rest of the software and contains numerous help dialogs.
  • The installer script directory has been renamed from config/ to mw-config/.
  • You now download your generated LocalSettings.php at install completion, rather than writing it straight to the configuration directory. The previous behavior was a security risk.
  • IBM DB2 and MSSQL support were dropped from the installer.

ResourceLoader

As web browsers have become more capable, the software that MediaWiki runs on them has become more complex, and developers need an efficient way to package and deliver code to web browsers.  To address this, MediaWiki 1.17 ships with ResourceLoader: a framework which combines and minifies CSS and JavaScript before delivering them to the web browser.  ResourceLoader improves performance, while also making it easier to write client-side features.

ResourceLoader allows developers to organize scripts, styles, and messages into named modules. Any number of modules can be loaded through a single request, improving page load times. Code is minified automatically and loaded when needed, reducing unnecessary downloads. Other advanced features include the ability to embed images in style sheets using data URIs, and to automatically flip horizontal information in style sheets for right-to-left user interfaces.
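
The "combine and minify" idea can be sketched in a few lines of Python. This is a toy, not ResourceLoader itself (which is written in PHP and JavaScript): the module registry, the module names and both functions are hypothetical, and the minifier is deliberately naive, only dropping indentation, blank lines and full-line `//` comments.

```python
# Hypothetical registry of named modules; each value is the module's script source.
REGISTRY = {
    "greeting": "// say hello\nfunction greet() {\n    return 'hi';\n}\n",
    "counter": "var count = 0;\n\n// bump\ncount += 1;\n",
}


def minify(source: str) -> str:
    """Naive minifier: strips indentation, blank lines and full-line // comments."""
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("//"):
            kept.append(stripped)
    return "\n".join(kept)


def bundle(names, registry):
    """Build one response body for several named modules, in the order requested."""
    return "\n".join(minify(registry[name]) for name in names)


# One request, two modules, one combined and minified response.
print(bundle(["greeting", "counter"], REGISTRY))
```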

Category sorting

Category sorting has been drastically improved.

  • Sorting is now case insensitive.
  • Sub-categories, pages and files can now be paged separately.
  • When several pages are given the same sort key, they sort by their names instead of randomly.
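
The first and last points can be illustrated with a short Python sketch. It is only an approximation of the behavior described, not MediaWiki's collation code: the `category_order` function and the sample data are made up, and the tie-break here compares page names with plain (case-sensitive) string ordering.

```python
def category_order(members):
    """Order category members.

    members: iterable of (page_name, sort_key) pairs.
    Sort keys are compared case-insensitively; pages that share a sort key
    fall back to their page name, so the order is deterministic.
    """
    return sorted(members, key=lambda member: (member[1].casefold(), member[0]))


pages = [
    ("Zebra", "animal"),
    ("apple", "Fruit"),
    ("Aardvark", "Animal"),
    ("Banana", "fruit"),
]
for name, sort_key in category_order(pages):
    print(sort_key, name)
```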

Language support

As with every release, MediaWiki 1.17 brings improved support for languages in MediaWiki, with improved translation and features for the many supported languages.

New languages:

  • Moroccan Spoken Arabic (ary)
  • Banjar (bjn)
  • Kabardian (Cyrillic) (kbd-cyrl)
  • Latgalian (ltg)
  • Minangkabau (min)
  • Dutch (informal) (nl-informal)
  • Rusyn (rue)

API

API bug fixes and new features have been added to 1.17, providing more options for input and output.

  • API output can now be formatted by PHP’s var_export() (format type is dbg/dbgfm).
  • An API module was added to list page properties.
  • PARAM_REQUIRED can now be used on parameters, to have the API enforce existence before code even reaches the module.
  • The API now has a Really Simple Discovery module, useful for publishing service information about the API.

API breaking changes

The API contains 3 breaking changes against previous releases:

  • action=patrol now requires POST.
  • The patrol token is no longer the same as edit token.
  • Session keys returned by ApiUpload are now strings instead of integers.
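
For client authors, the first two changes mean that a patrol call has to be sent as a POST and has to carry a patrol token rather than an edit token. Here is a minimal sketch of building such a request with Python's standard library. The wiki URL is a placeholder, the function name is hypothetical, and the `rcid` and `token` parameter names reflect our understanding of the patrol module, so check them against the API documentation of your wiki. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint; replace with your own wiki's api.php.
API_URL = "https://wiki.example.org/w/api.php"


def build_patrol_request(rcid: int, patrol_token: str) -> Request:
    """Build (but do not send) a POST request for action=patrol.

    The token must be a patrol token obtained separately; an edit token
    is no longer accepted for this action.
    """
    body = urlencode({
        "action": "patrol",
        "rcid": rcid,
        "token": patrol_token,
        "format": "json",
    }).encode("utf-8")
    # Passing method="POST" makes the intent explicit.
    return Request(API_URL, data=body, method="POST")


request = build_patrol_request(12345, "example-token+\\")
print(request.get_method(), request.full_url)
```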

Other

  • Interwiki links in articles are now recorded in a separate table.
  • Users can now add CSS and JS to all skins by using User:<name>/common.css and User:<name>/common.js.
  • Oracle Database support has been improved, and is now ready for beta testing. If you work in an environment where Oracle is readily available, and you can’t get access to MySQL, this may be a useful alternative for you. Please try it out and let us know if it works for you. Oracle support is not yet recommended for use in production.

This blog post is based on the MediaWiki 1.17 wiki page on www.mediawiki.org, which was collaboratively edited. Please see the page history for credits.

Developers go home after productive Berlin hackathon

These people make Wikipedia and MediaWiki awesome.

Most MediaWiki developers who attended the Berlin hackathon this weekend have left the German capital and returned home, after three days of collaborative coding, group discussions, short presentations, and bug fixing.

A lot of work was already accomplished on Friday and Saturday, including presentations on test frameworks, coding of new features, discussions on wikitext parsers, and a usability testing session.

Things were a bit slower on Sunday, but lack of sleep didn’t stop developers from coding and smashing bugs. Brandon Harris gave a short talk about identity, editor retention and social features. Domas Mituzas talked about how to improve performance; Tim Starling followed with a discussion of adding HipHop support to MediaWiki, and its planned deployment to Wikimedia sites.

Mark Bergsma also gave an overview of the situation of the Wikimedia infrastructure regarding IPv6 (and our participation in IPv6 Day) and Mathias Schindler discussed WebP support. All the live notes taken yesterday are available.

The rest of the day was used to continue to code, discuss and smash bugs. Some groups explored the city before returning home. The day ended with participants hacking and socializing at the C-base.

If you couldn’t attend, the videos of all the talks are available for you to watch (or re-watch). Many pictures of the event are already on Wikimedia Commons, and more will follow. Presentation slides will be added to the hackathon page as they come in.

We hope the live video streaming, real-time note taking, and IRCing / tweeting were useful for remote attendees; please tell us what we did right and what needs improving. We’d love to get feedback on what worked for you, and what didn’t.

We’d like to thank everyone who was involved in making this event awesome, and particularly the participants, who came from all over the world to work together to improve our technical platform.

Many thanks to the team from Wikimedia Deutschland as well, who masterminded the whole event: Nicole Ebber, Daniel Kinzler, Cornelius Kibelka, and the rest of their team.

Participants agreed they were looking forward to more hackathons, in Berlin and elsewhere. We’ll see you there!


Guillaume Paumier

Photo from Wikimedia Commons by Tobias Schumann, under CC-by-sa 3.0 Germany.

Berlin hackathon continues with group coding, discussions and bug squashing

With tired eyes, and fueled by ridiculously large amounts of coffee, Wikimedia developers and engineers are now starting their third and last day of collaborative coding at the Berlin “hackathon”.

The event, organized by Wikimedia Deutschland, has been going on since Friday. About a hundred participants are enjoying our third day at coworking / hackspace Betahaus.

Yesterday, more coding happened, and even more bugs were smashed: about 65 since we started on Friday. There remains plenty to work on during this hackathon, though, if you’d like to help.

Saturday afternoon was also devoted to the discussions about the possible evolutions of the MediaWiki parser (see notes), a step towards a visual editor for Wikipedia and other MediaWiki-powered sites. (“Visual editor” seems to have reached consensus as a more social class-neutral replacement for “rich text editor”.)

Yesterday, the hackathon also hosted a usability testing session on the Kiwix offline app, led by Ryan Kaldari. The ops team is continuing its ongoing work on HTTPS & IPv6, and Victor Vasiliev partially implemented a long-awaited feature for Wikimedia wikis: a global watchlist.

The day ended with a party (with free beer and food) organized by our friends from Wikia.

You can take a look at all the live notes taken yesterday. People are also taking photos, and more will follow.

Some talks that were originally scheduled for Saturday are happening today, including Brandon Harris’ short presentation on “identity”, Mark Bergsma’s on IPv6, and the discussions on performance and HipHop, with Domas Mituzas and Tim Starling.

You can participate remotely in real time by watching the live video stream (all talks are recorded), and participating in our live note-taking in Etherpad.

You can also join us on IRC in #mwhack11 or #mediawiki on Freenode, and follow our activity using the #mwhack11 hashtag on Twitter and Identi.ca.

This year’s motto is “talk less, code more”. Happy coding!


Guillaume Paumier