Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts Tagged ‘hackathon’

Lua previewed

The Berlin hackathon 2012 brought together a record number of people, who worked on many technical issues. Some came to learn about MediaWiki, others to learn the finer points of Git and Gerrit. The great thing about MediaWiki hackathons is that there is typically a good mix of knowledgeable people, talented people, and people who can explain and help with difficult technical issues. It is also where new technologies are previewed; this time it was Lua that got much of the limelight.
It is a pleasure to share what theDJ has to say in answer to questions about the hackathon and Lua.
What is the attraction of a hackathon, and what was special about Berlin 2012?

For me as a volunteer the benefit of such an event is twofold. The first part is of course getting to know the people that you usually only interact with online. It’s just more fun and the connections you build are simply stronger. It often also helps you in your future online communications with these people. When you know people in person you also tend to communicate better online.

The other reason is that it is a great way to learn, brainstorm, prototype rapidly, and get questions asked and answered efficiently. Nothing beats being in the same room when discussing or working on a topic.

There were several themes in the presentations and workshops … you chose Lua. What is Lua and what is its relevance?

The complexity of pages is actually one of our biggest performance issues right now and the [[en:Barack Obama]] page is a well known example of that. After an edit of that page it often takes well over 20 seconds for the server to render the page again. This is creating a huge resource load on the server and it is confusing the editors because it seems like the server is not responding to their edits.

The complexity is caused by two things you can use in pages: templates and parser functions. The performance of these elements is shaky, in large part because our inventive MediaWiki users have found ingenious yet complex ways of working around the limited functionality these two elements provide.
Ideally much of the functionality would be converted into PHP MediaWiki extensions, but that development path is much slower and less accessible for MediaWiki users. For years there have been discussions in the developer community on how to tackle this problem, and a clearer consensus is now starting to form. The idea is to move away from the old combination of templates and parser functions and replace much of it with code written in a scripting language named Lua, which is still accessible for users, much more capable than templates and parser functions, yet much easier than PHP extensions.

Overall, Lua promises much higher performance and flexibility than templates and parser functions, yet will allow us to keep the same type of server-side safeguards that are so important for a major website like Wikipedia.

If Lua is scheduled for 2013, why all this attention now?

Exactly because it is not deployed yet. Right now we can still make significant changes easily without causing too much trouble for users. But to know what changes are needed, you do need to use the system and learn from that usage. By engaging the developer community to experiment with writing templates and converting templates, we can find issues that are still outstanding or that were simply never anticipated when implementing the system, before it goes into wider deployment.

Simply put, it is also because the existing templates and parser functions that are in use right now on all these different MediaWiki wikis are so complicated. It will take years to replace all the code, so in order to reap the benefits as soon as possible, you will want to tackle the most complex and worst-performing code early in the conversion.

You have been converting the “coordinates” template. What is its attraction?

The “Coord” template is a real-life example of a template with high complexity that is used on tens of thousands of pages: exactly the type that in theory should benefit greatly from conversion to Lua. At the same time it is still ‘small’ enough to actually get done within a reasonable amount of time. The process of converting it, instead of writing something from ‘scratch’, will likely mimic the way users will start when using the new Lua capabilities and was therefore important to test.

I have currently spent about 9 hours on it, and am probably about halfway through the conversion. After doing a full conversion I would like to benchmark the difference between the two implementations so we can further validate our suspicions about the real-world benefits of this new Lua method. A partial conversion of the template seems to have already sped it up by at least 4x in my preliminary assessments.
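
To give a feel for the kind of logic a coordinate template has to perform, here is a minimal sketch of converting degrees, minutes and seconds to decimal degrees. It is written in Python rather than Lua and is not the actual Coord template code; the function name, its parameters and the hemisphere handling are assumptions made purely for illustration.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    Hypothetical helper for illustration; not taken from the Coord template.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South and west are expressed as negative decimal degrees.
    return -value if hemisphere in ("S", "W") else value


if __name__ == "__main__":
    # 52 degrees 31 minutes north, rounded for display
    print(round(dms_to_decimal(52, 31, 0, "N"), 4))
```

In a scripting language this is a few lines of arithmetic; expressing the same thing with templates and parser functions typically needs nested expressions for every optional parameter, which is where much of the complexity comes from.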

How will this functionality become available on the other 270+ Wikipedias?

Lua is now available on Wikimedia Labs for testing, and this will be followed by gradually adding mediawiki.org and other “low priority” production sites. There are still major parts of the extension that require attention before it is ready for a general release.

In terms of the scripts themselves, users will probably start with the most resource-‘expensive’ templates on English Wikipedia and slowly work their way through, at every step trying to keep everything as compatible with the old systems as needed.

Should we not implement the lessons of “Gadgets 2.0” and share them from a central site?

I think having a centralized Lua module repository, similar to the central Gadget repository for JavaScript that we will soon have, is something we should definitely consider. Past experience with scripts developed by users has taught us that it is a maintenance hell, because people fork and adapt the code for every single wiki. Though most of those copies are 95% the same code, they are not actually the same script. If you want to change something in them, you either need to go through 270 wikis, or people invest valuable time in fixing a problem that someone else has already fixed on another wiki.

For the Lua modules I think it is very important to be able to share that 95% of code that will be the same on all the wikis. This is not yet possible, but it has been discussed. It is my opinion that we really need to get that working before a full deployment in 2013.

Several people were hacking Lua code and even more people attended the workshop. What is the most relevant thing for them to do moving forward?

Provide feedback based on their experiences. As I see it, this is a learning stage, and as a group we can only take all lessons into account if we share what each and every one of us has learned.

You identified two parts to converting templates to Lua, the conversion itself and optimisation. How relevant will optimisation be?

As I said earlier, the users have found ingenious but complex ways around the limitations of templates and parser functions. A conversion is about changing from one language to the other, without changing HOW the code works. This conversion will probably already provide large speed gains.
Optimizing is about getting rid of all the weird constructs that we used because we worked around the limitations of templates and parser functions. These constructs are no longer required and will actually slow down the Lua script, so you will want to remove them.
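
The difference between the two steps can be sketched with a toy example. It is written in Python rather than Lua, and the "repeat a string" task and both function names are hypothetical, not taken from any real template. The first function mirrors a template-style workaround (one hard-coded branch per supported count, because templates have no loops); the second is what remains once the workaround is removed.

```python
def repeat_converted(text, count):
    """Literal conversion: keeps the template's one-branch-per-case structure."""
    if count == 1:
        return text
    elif count == 2:
        return text + text
    elif count == 3:
        return text + text + text
    # The imagined original template supported nothing beyond three repeats.
    return ""


def repeat_optimized(text, count):
    """Optimized version: the language can repeat a string directly."""
    return text * max(count, 0)


if __name__ == "__main__":
    for n in (1, 2, 3):
        # Both versions agree wherever the original template had a branch.
        print(n, repeat_converted("ab", n) == repeat_optimized("ab", n))
```

The converted version behaves like the old template, limits included. The optimized version is shorter and also drops the artificial three-repeat limit, so it is a deliberate change in behavior as well as in structure.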

You use Lua in your day job. In what way is Lua for MediaWiki different from the Lua that you know?

Not so much actually. Of course there is the interface towards MediaWiki which is different from the interface that I work with (an interface to write mobile applications) but the language is exactly the same.

It could have been the first question: what benefit will Lua bring us?

It will speed up pages, and also make it possible to do even more advanced templating. At the same time it will look a bit less scary to editors, and will create more readable code that is easier to maintain.

Diverse Wikimedia tech crowd gathers in Berlin

A diverse crowd of engineers, volunteer developers, template writers, gadget maintainers and bot operators have gathered in Berlin for this coding event.

The Wikimedia technical community is gathering this week-end in Berlin to code, discuss, learn and generally improve the infrastructure and tools behind Wikipedia and its sister sites.

The “Berlin Hackathon 2012”, a yearly coding event happening in the German capital since 2009, is co-organized by Wikimedia Deutschland. It is taking place this year at STATION Berlin, a former postal train station now converted to an exhibition and event hall. It was preceded by a two-day summit on Wikidata and RENDER, projects aiming to integrate structured data with Wikipedia.

The crowd of about 130, coming from 30 countries, includes Wikimedia Operations engineers, who maintain the site’s hardware and network infrastructure, and MediaWiki developers, who build and improve the software powering all Wikimedia sites. Topics of discussion and workshops include the new code review workflow after the move of the primary code repository to Git, in an attempt to facilitate contributions.

“This is the largest Wikimedia coding event we’ve ever had,” said Sumana Harihareswara, Engineering Community Manager at the Wikimedia Foundation, as she opened the event. Sumana, the lead organizer of the event, has reached out to many Wikimedian technologists outside the core circle of MediaWiki developers, to make the event more inclusive. Indeed, many attendees are joining a Wikimedia coding event for the first time.

The hackathon focuses on recent developments that are enabling more community-driven innovation.

For example, expert Wikipedia editors have been invited to learn about Lua scripting, which is poised to replace or supplement MediaWiki templates later this year.

Templates, originally introduced as a way to embed standardized texts in several Wikipedia articles, have been combined by Wikipedians with parser functions to create a limited programming language. They have become a performance bottleneck for long Wikipedia articles embedding many of them, like Barack Obama, for which a new version of the page can take up to 40 seconds to be generated by the server.

By replacing complex templates with simpler ones augmented with Lua scripts, developers aim to provide editors with a proper scripting language that will be both more powerful and more efficient than ad-hoc ParserFunctions-based logic. Tutorials are being held in Berlin to help end-users learn about Lua and how to adapt templates to this new technology.

Another group of attendees is “Gadgets” maintainers, who have come to Berlin to learn how to adapt their tools to a new version of the software, called “ResourceLoader 2.0”. It will make it possible to centralize custom JavaScript snippets developed by editors, and share them across Wikimedia sites for other communities to use.

Developers and engineers, used to collaborating online, are also using this opportunity to socialize and discuss face-to-face.

“We’ve never had so much activity in our technical community” explains Erik Möller, VP of Engineering and Product Development. “In Berlin, we’re systematically raising awareness of all the recent developments that are enabling more community-driven innovation than ever before. It’s a great time to be a Wikimedian.”

Besides coding, workshops and presentations, the event is also an opportunity for Wikimedia technologists, who usually collaborate over the internet, to mingle and socialize. They will next meet in July in Washington, D.C., for the annual Wikimania conference and its very own Hackathon. By then, they are expected to unveil the first working prototype of the Visual Editor, the upcoming user-friendly interface to edit Wikipedia articles.

Guillaume Paumier
Technical communications manager

The #MediaWiki #hackathon in Pune, #India

When good people get together in a friendly, well organised setting like this weekend in Pune, many great things happen. Several MediaWiki developers had come to provide the many people new to MediaWiki with their expertise and guide people into its inner workings.

Many people worked on Wikimedia mobile and the SmartPhone software, others worked on MediaWiki and its extensions. Bugs got fixed and functionality got extended.

One of the surprises was two people working on the localisation for the Mongolian language. The inclusion of a web font that will support the Dzongkha language is another.

Dzongkha is the official language of Bhutan and, according to Ethnologue, the script used is either Tibetan script, Uchen style or Tibetan script, Umed style. These scripts and styles are also used for the Tibetan language, so it is not only Dzongkha that stands to benefit.

One of the highlights of the work on the SmartPhone app is support for scripts that are written from right to left; this is now “beta” functionality. The result of more people looking at the code was that several bugs received the attention needed to make them go away. Scrolling was one area that got attention; this results in a smoother user experience.

New input methods have been created for Punjabi transliteration and for Gujarati, to be included in Narayam. The continued collaboration with Red Hat engineers ensures that our work benefits both MediaWiki and Red Hat/Fedora. We do realise that there is still a lot to do, and not only documentation. Additional work was done on the “visual on-screen keyboard” that was started at the previous hackathon in Pune; it still needs more testing and design work.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

Techies learn, make, win at Foundation’s first San Francisco hackathon

Participants at the San Francisco hackathon in January 2012

In January, 92 participants gathered in San Francisco to learn about Wikimedia technology and to build things in our first Bay Area hackathon.

After a kickoff speech by Foundation VP of Engineering Erik Möller (video), we led tutorials on the MediaWiki web API, customizing wikis with JavaScript user scripts and Gadgets, and building the Wikipedia Android app.  (We recorded each training; click those links for how-to guides and videos.)  We asked the participants to self-organize into teams and work on projects.  After their demonstration showcase, judges awarded a few prizes to the best demos.

The Mumbai hackathon was sweet

When a hackathon is organised, it is wonderful when the reality of the results exceeds expectations. The reality was that some of India’s best and brightest attended the hackathon. They represented many of the languages  of India, and it showed.

Seven Indians and a German created an input method for their language. A Russian keyboard method is promised for the next day. There was a jQuery wizard who created a wonderful and necessary addition to the Narayam extension: a visual cue to where the characters are on the keyboard. This information comes directly from the Narayam definitions and the best part is that the visual cue actually works as well.

The WebFonts extension got its reality check. WebFonts provides default fonts in order to ensure that nobody sees the infamous Unicode squares and numbers instead of the desired characters. The MediaWiki software is exclusively open source, and consequently the fonts we deliver through the WebFonts extension need to be freely licensed, too.  The default font we use for the Indic languages is the Lohit font produced by Red Hat. It was quite astonishing to learn that some of the characters do not look the way they should. Bugs have been filed for this at Red Hat and more work will be done.

We are going to roll out the WebFonts extension on December 12th. Our aim is to install it on the Indic projects. When we have freely licensed fonts that show languages correctly, we will finally be able to provide readable content to everyone. We will be working towards resolving the issues identified at the hackathon.

The Mumbai hackathon has also been good for the Kiwix off-line reader; not only was the software localised into several languages, new developers also familiarized themselves with the software itself to implement further improvements. This is quite important because many Indian people have no or intermittent access to the Internet. In addition to Wikipedia content, there are many projects in India to transcribe books that are in the public domain; as the Kiwix software gets ready to support this content, it will help more and more people get access to India’s rich cultural heritage.

Mobile support was the third centre of gravity; many first-time Wikimedia hackers teamed up with seasoned Wikimedia developers and this produced great results. This included work on a mobile landing page for India, as well as a gateway that allows users to receive Wikipedia articles over SMS and the carrier-specific USSD technology. To appreciate this, consider that many people do not have access to the Internet and consequently to our content. Work also continued on the “Wikipedia Zero” project, which aims to bring Wikipedia and other Wikimedia content to millions of users without data charges.

We also saw an interesting connection with the October 2011 Coding Challenge. Developer Yuvipanda implemented Android 2.2 support for one of the coding challenge submissions, the “Share with Wikimedia Commons” Android app (as well as for the official Wikipedia Android app).

All this will get some review, maybe some polishing but we are quite eager to bring this functionality to you.

Many of the hackers were new to MediaWiki. With an introduction by Erik and private tutoring by Sumana, Tomasz, Patrick, and others, several people really got into the swing of things to the extent that some bugs were smashed.  The hackathon proved as always that when you bring great people together special things can and do happen.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

Tech meetup moves Wikimedia infrastructure forward

Earlier this month, about thirty MediaWiki developers and interested technologists gathered in New Orleans to learn and to work on Wikimedia’s technical infrastructure.  We made broad progress on the infrastructure of innovation at Wikimedia (notes).  Specifically:

Tim Starling and DJ Bauch driving towards greater media file storage system independence and robustness

  • We are now much closer to officially opening the doors to Wikimedia Labs and giving far more people the ability to contribute to MediaWiki without having to set up and maintain their own development environments at home.  Wikimedia Labs will provide hosted, virtualized test and development sandboxes for new and experienced programmers and systems administrators.  Many developers got beta Labs accounts, we tested at a larger scale, and we fixed several bugs.
  • Developers agreed to create a file backend abstraction layer to enable large-scale MediaWiki installations to use one of several storage systems to contain big collections of big media files.  (Wikimedia plans on using Swift, which is open source.) Microsoft’s Ben Lobaugh and SAIC’s DJ Bauch collaborated towards improving MediaWiki’s performance on Microsoft technologies as well.  Developers made architectural decisions, refactored some existing code, and improved documentation and tests for the SwiftMedia extension to MediaWiki.
  • We now have a continuous integration server up and running.  This will continuously run tests checking on the latest new features and bugfixes that developers write, resulting in fewer bugs and faster development. Developers will need to write tests to reap the benefits, so Chad Horohoe taught a test-writing workshop.

    Photo: Chad Horohoe teaching developers unit testing

  • Max Semenik finished and demonstrated the first version of his API Query Sandbox.  This allows software developers anywhere to experiment with ways to automatically get data from Wikipedia or other sites that run MediaWiki, thus enabling wider and deeper reuse of Wikimedia content.
  • Operations folks continued the Puppetization of our infrastructure: they completely reworked Varnish management in Puppet, and worked on Puppet configurations for SwiftMedia testing. This configuration management work will ensure that ops can move faster and more confidently in building and maintaining Wikimedia infrastructure. And Canonical’s Mark Mims and Kapil Thangavelu worked on improving methods for Wikimedia developers “to spin up stacks of services within the labs environment” using Juju (more details).
  • Since the engineering department is planning a switch from Subversion to Git in the next few months, Brion taught nearly everyone there how Git works (slides, audio), and how we’ll be using Git in the future. This change in our source code repository and workflow will, we hope, enable more speed and flexibility in development, both for WMF developers and community contributors.

    Photo: Brion Vibber leading developers into the "glorious Git future"
  • We prioritized and addressed several open requests for the operations team and defect reports about the latest version of MediaWiki, 1.18, which had just been deployed across WMF sites.
  • Roan found and fixed an issue that was spouting symbolic link errors into our Apache logs, so now it’ll be easier for us to see more dangerous errors in those logs.
  • Google Summer of Code students Salvatore Ingala and Kevin Brown made progress on integrating their summers’ work into MediaWiki as used and deployed by others; Salvatore and WMF developer Roan Kattouw have a plan for getting his user scripts improvements reviewed and deployed, so they can benefit Wikimedia readers and editors.
  • A volunteer came in on Friday night knowing nothing about developing for MediaWiki, and by the end of the weekend had a working development environment on her laptop and had some ideas about how to contribute.
  • We had substantive conversations about the summer internship program and about third-party collaboration that will affect how we work in the future.
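
As an illustration of the kind of request one might try out in an API sandbox, here is a minimal Python sketch that builds a MediaWiki web API query URL without sending it. The choice of English Wikipedia as the endpoint, the helper name, and the specific parameters (`action=query`, `prop=info`, `format=json`) are our assumptions for the example, not something demonstrated at the event.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed example endpoint; any MediaWiki site exposes its own api.php.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(titles, prop="info"):
    """Build (but do not send) a query URL for the given page titles.

    Hypothetical helper for illustration only.
    """
    params = {
        "action": "query",
        "titles": "|".join(titles),  # multiple titles are separated by "|"
        "prop": prop,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_query_url(["Berlin", "Pune"])
    print(url)
    # Parse the URL back to check which parameters it carries.
    print(parse_qs(urlparse(url).query))
```

Building the URL separately from sending it makes it easy to inspect a query before running it against a live site.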

Launch Pad New Orleans, a great venue

We also ate dinner together, walked Bourbon Street, and generally got to know colleagues we’d never met before.  I expect these relationships will bear fruit for years to come.

Thanks to Ryan Lane and Dana Isokawa for organizing the event with me, and thanks to Launch Pad New Orleans for providing the venue!

Our next developers’ event is a hackathon in Mumbai November 18-20 concentrating on internationalization, localization, and mobile work.  To find out about other upcoming Wikimedia technical events, check the meetings wiki page, and follow @MediaWikiMeet on Identi.ca or Twitter.

Sumana Harihareswara
Volunteer Development Coordinator
Wikimedia Foundation

Developers go home after productive Berlin hackathon

These people make Wikipedia and MediaWiki awesome.

Most MediaWiki developers who attended the Berlin hackathon this weekend have left the German capital and returned home, after three days of collaborative coding, group discussions, short presentations, and bug fixing.

A lot of work was already accomplished on Friday and Saturday, including presentations on test frameworks, coding of new features, discussions on wikitext parsers, and a usability testing session.

Things were a bit slower on Sunday, but lack of sleep didn’t stop developers from coding and smashing bugs. Brandon Harris gave a short talk about identity, editor retention and social features. Domas Mituzas talked about how to improve performance; Tim Starling followed by discussing adding HipHop support for MediaWiki, and its planned deployment to Wikimedia sites.

Mark Bergsma also gave an overview of the situation of the Wikimedia infrastructure regarding IPv6 (and our participation in IPv6 Day) and Mathias Schindler discussed WebP support. All the live notes taken yesterday are available.

The rest of the day was used to continue to code, discuss and smash bugs. Some groups explored the city before returning home. The day ended with participants hacking and socializing at the C-base.

If you couldn’t attend, the videos of all the talks are available for you to watch (or re-watch). Many pictures of the event are already on Wikimedia Commons, and more will follow. Presentation slides will be added to the hackathon page as they come in.

We hope the live video streaming, real-time note taking, and IRCing / tweeting were useful for remote attendees. We’d love to get feedback on what worked for you and what needs improving.

We’d like to thank everyone who was involved in making this event awesome, and particularly the participants, who came from all over the world to work together to improve our technical platform.

Many thanks to the team from Wikimedia Deutschland as well, who masterminded the whole event: Nicole Ebber, Daniel Kinzler, Cornelius Kibelka, and the rest of their team.

Participants agreed they were looking forward to more hackathons, in Berlin and elsewhere. We’ll see you there!


Guillaume Paumier

Photo from Wikimedia Commons by Tobias Schumann, under CC-by-sa 3.0 Germany.

Berlin hackathon continues with group coding, discussions and bug squashing

With tired eyes, and fueled by ridiculously large amounts of coffee, Wikimedia developers and engineers are now starting their third and last day of collaborative coding at the Berlin “hackathon”.

The event, organized by Wikimedia Deutschland, has been going on since Friday. About a hundred participants are enjoying our third day at coworking / hackspace Betahaus.

Yesterday, more coding happened, and even more bugs were smashed: about 65 since we started on Friday. There remains plenty to work on during this hackathon, though, if you’d like to help.

Saturday afternoon was also devoted to the discussions about the possible evolutions of the MediaWiki parser (see notes), a step towards a visual editor for Wikipedia and other MediaWiki-powered sites. (“Visual editor” seems to have reached consensus as a more social class-neutral replacement for “rich text editor”.)

Yesterday, the hackathon also hosted a usability testing session on the Kiwix offline app, led by Ryan Kaldari. The ops team is continuing its ongoing work on HTTPS & IPv6, and Victor Vasiliev partially implemented a long-awaited feature for Wikimedia wikis: a global watchlist.

The day ended with a party (with free beer and food) organized by our friends from Wikia.

You can take a look at all the live notes taken yesterday. People are also taking photos, and more will follow.

Some talks that were originally scheduled for Saturday are happening today, including Brandon Harris’ short presentation on “identity”, Mark Bergsma’s on IPv6, and the discussions on performance and HipHop, with Domas Mituzas and Tim Starling.

You can participate remotely in real time by watching the live video stream (all talks are recorded), and participating in our live note-taking in Etherpad.

You can also join us on IRC in #mwhack11 or #mediawiki on Freenode, and follow our activity using the #mwhack11 hashtag on Twitter and Identi.ca.

This year’s motto is “talk less, code more”. Happy coding!


Guillaume Paumier

Wikimedia developers start second day of Berlin hackathon

Typical traffic lights in Berlin

Green light: You can code now!

MediaWiki developers and Wikimedia engineers are starting their second day of coding, discussing and bug-smashing today in Berlin, Germany. This “hackathon”, organized by Wikimedia Deutschland, started yesterday, and will last until tomorrow, Sunday.

After a short introduction yesterday, participants quickly moved on to group discussions, short presentations and coding. The event is run as an unconference, and this format has proven to be quite effective so far.

Lightning talks yesterday included presentations about the new datacenter (by Mark Bergsma), Kiwix and offline (Emmanuel Engelhart), PhotoCommons (Hay Kranen), OpenStreetMap integration (Tim Adler), WikiLove (Ryan Kaldari), PHPunit (Ashar Voultoiz), the new mobile gateway (Patrick Reilly), community-oriented testing (Ryan Lane), Narayam (Purodha Blissenbach) and distributed JavaScript testing (Timo Tijhof).

Several bugs were also fixed yesterday, but there remains quite a bit to smash during this hackathon.

A lot of group discussions (e.g. about HipHop, and the MediaWiki release plan) and actual coding happened during the afternoon and evening. You can take a look at all the notes taken yesterday in real time.

Today’s talks include discussions on “Identity” (Brandon Harris), performance, including plans to use HipHop for PHP (Domas Mituzas and Tim Starling), as well as many discussions and short talks about wikitext parsers.

To participate remotely in real time: You can still watch the live video stream (all talks are recorded), and participate in our live note-taking in Etherpad.

You can also join us on IRC in #mwhack11 or #mediawiki on Freenode, and follow our activity using the #mwhack11 hashtag on Twitter and Identi.ca.

Another way to participate is by testing some of the tools people are developing. For example, Purodha Blissenbach is looking for testers for Narayam (a keyboard mapping for Indic languages), and Hay Kranen would like people to test the PhotoCommons WordPress plugin. Please contact them if you want to get involved.

This year’s motto is “talk less, code more”. Happy coding!


Guillaume Paumier

Wikimedia tech crowd and MediaWiki developers gather in Berlin

Developers, engineers, laptops, food, and wi-fi.

MediaWiki developers and Wikimedia engineers have flown from all over the world to meet up in Berlin.

For the third time, Wikimedia Deutschland is organizing a “hackathon”, a coding event where developers work together to improve the MediaWiki software and the technological platform for Wikipedia.

The event started today at the betahaus, a coworking and social space in Berlin’s Kreuzberg neighborhood, close to Moritzplatz. It will last until Sunday; a rough schedule is available.

Two other groups of Wikimedians are also meeting this week-end at the betahaus: Wiki Loves Monuments enthusiasts, and the Language Committee.

Work at the hackathon is notably focused on improvements in MediaWiki’s text editor, development tools, improvements in the parser, mobile apps, and bug fixing. We’re also having a few lightning talks.

These developer days are included in the program of the Berlin Web Week, a series of events happening in Berlin in May and bringing together Internet and software communities and industry players.

To participate remotely: join us on Twitter and Identi.ca, where we’re using the #mwhack11 hashtag. We’re posting links there to our public notes taken in real time.

You can also watch the live video stream (all talks are recorded), join us on IRC in #mwhack11 or #mediawiki on Freenode, and check out the event page on Facebook.

This year’s motto is “talk less, code more”. Happy coding!


Guillaume Paumier