Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Lua previewed

The Berlin hackathon 2012 brought together a record number of people to work on many technical issues. Some came to learn about MediaWiki, others to learn the finer points of Git and Gerrit. The great thing about MediaWiki hackathons is that there is typically a good mix of knowledgeable people, talented people and people who can explain and help with difficult technical issues. It is also where new technologies are previewed, and this time Lua got much of the limelight.

It is a pleasure to share what theDJ has to say in answer to questions about the hackathon and Lua.

What is the attraction of a hackathon, and what was special about Berlin 2012?

For me as a volunteer, the benefit of such an event is twofold. The first part is of course getting to know the people that you usually only interact with online. It’s just more fun, and the connections you build are simply stronger. It also helps your future online communication with these people, because you tend to communicate better with people you have met in person.

The other reason is that it is a great way to learn, brainstorm, prototype rapidly and get questions asked and answered efficiently. Nothing beats being in the same room when discussing or working on a topic.

There were several themes in the presentations and workshops … you chose Lua. What is Lua and what is its relevance?

The complexity of pages is actually one of our biggest performance issues right now, and the [[en:Barack Obama]] page is a well-known example of that. After an edit to that page, it often takes well over 20 seconds for the server to render the page again. This creates a huge resource load on the server, and it confuses editors because it looks as if the server is not responding to their edits.

The complexity is caused by two things you can use in pages: templates and parser functions. The performance of these elements is shaky, in large part because our inventive MediaWiki users have found ingenious yet complex ways of working around the limited functionality these two elements provide.

Ideally much of the functionality would be converted into PHP MediaWiki extensions, but that development path is much slower and less accessible for MediaWiki users. For years there have been discussions in the developer community on how to tackle this problem, and a clearer consensus is now starting to form. The idea is to move away from the old combination of templates and parser functions and replace much of it with code written in a scripting language named Lua, which is still accessible for users, much more capable than templates and parser functions, yet much easier than PHP extensions.

Overall, Lua promises much higher performance and flexibility than templates and parser functions, while still allowing us to have the same kind of server-side safeguards that are so important for a major website like Wikipedia.

If Lua is scheduled for 2013, why all this attention now?

Precisely because it is not deployed yet. Right now we can still make significant changes easily without causing too much trouble for users. But to know what changes are needed, you do need to use the system and learn from that usage. By engaging the developer community to experiment with writing and converting templates, we can find issues that are still outstanding, or that were simply never anticipated when implementing the system, before it goes into wider deployment.

Put simply, the existing templates and parser functions in use on all these different wikis are very complicated, and it will take years to replace all the code. To reap the benefits as soon as possible, you will want to tackle the most complex and worst-performing code early in the conversion.

You have been converting the “coordinates” template. What is its attraction?

The “Coord” template is a real-life example of a highly complex template that is used on tens of thousands of pages, exactly the type that in theory should benefit greatly from conversion to Lua. At the same time it is still ‘small’ enough to actually get done within a reasonable amount of time. The process of converting it, instead of writing something from ‘scratch’, will likely mimic the way users will start out with the new Lua capabilities and was therefore important to test.

I have currently spent about 9 hours on it, and am probably about halfway through the conversion. After doing a full conversion I would like to benchmark the difference between the two implementations so we can further validate our suspicions about the real-world benefits of this new Lua method. In my preliminary assessments, a partial conversion of the template already seems to have sped it up by at least 4x.

How will this functionality become available on the other 270+ Wikipedias?

Lua is now available on Wikimedia labs for testing, and this will be followed by gradually adding mediawiki.org and other ‘low priority’ production sites. There are still major parts of the extension that require attention before it is ready for a general release.

In terms of the scripts themselves, users will probably start with the most resource-‘expensive’ templates on English Wikipedia and slowly work their way through, all the while trying to keep everything as compatible with the old systems as needed.

Should we not implement the lessons of “Gadgets 2.0” and share them from a central site?

I think having a centralized Lua module repository, similar to the central Gadget repository for JavaScript that we will soon have, is something we should definitely consider. Past experience with scripts developed by users has taught us that they are a maintenance hell, because people fork and adapt the code for every single wiki. Though most of those copies are 95% the same code, they are not actually the same script. If you want to change something in them, you either need to go through 270 wikis, or people invest valuable time in fixing a problem that someone else has already fixed on another wiki.

For the Lua modules I think it is very important to be able to share the 95% of the code that will be the same on all the wikis. This is not yet possible, but it has been discussed. It is my opinion that we really need to get that working before a full deployment in 2013.

Several people were hacking Lua code, and even more people attended the workshop. What is the most relevant thing for them to do moving forward?

Provide feedback based on their experiences. As I see it, this is a learning stage, and as a group we can only take all lessons into account if each and every one of us shares what they have learned.

You identified two parts to converting templates to Lua, the conversion itself and optimisation. How relevant will optimisation be?

As I said earlier, the users have found ingenious but complex ways around the limitations of templates and parser functions. A conversion is about changing from one language to the other, without changing HOW the code works. This conversion will probably already provide large speed gains.

Optimizing is about getting rid of all the weird constructs that we used to work around the limitations of templates and parser functions. These constructs are no longer required and will actually slow down the Lua script, so you will want to remove them.
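
To make the distinction concrete, here is a minimal sketch, written in Python purely for illustration rather than as a MediaWiki Lua module. It assumes a hypothetical template that zero-pads a number to three digits with nested conditionals because it has no string-padding function available. The names `pad_converted` and `pad_optimized` are invented for this example and do not come from any real template or module.

```python
def pad_converted(n: int) -> str:
    """Literal conversion: keeps the nested-conditional shape of the
    hypothetical template, only the language has changed."""
    if n < 10:
        return "00" + str(n)
    else:
        if n < 100:
            return "0" + str(n)
        else:
            return str(n)


def pad_optimized(n: int, width: int = 3) -> str:
    """Optimised version: the workaround is removed and a built-in
    string method does the padding directly."""
    return str(n).zfill(width)


if __name__ == "__main__":
    # Both versions produce the same padded strings for these inputs.
    for value in (7, 42, 123):
        print(pad_converted(value), pad_optimized(value))
```

Both functions return the same result for non-negative whole numbers. The first only mirrors the old structure in a new language, while the second drops the workaround altogether, which is the step described above as optimisation.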

You use Lua in your day job. In what way is Lua for MediaWiki different from the Lua that you know?

Not so much, actually. Of course the interface to MediaWiki is different from the interface that I work with (an interface for writing mobile applications), but the language is exactly the same.

It could have been the first question: what benefit will Lua bring us?

It will speed up pages and also make it possible to do even more advanced templating. At the same time it will look a bit less scary to editors, and it will produce more readable code that is easier to maintain.
