Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts Tagged ‘templates’

What Lua scripting means for Wikimedia and open source

Yesterday we flipped a switch: editors can now use Lua, an innovative programming language, to generate sections of wiki pages on all our sites. We’d like to talk about what this means for the open source community at large, for Wikimedians, and for our future.

Why we did this

In the old wikitext templating system, this is part of Template:Citation/core. Any Wikipedia article that cites a source makes our CPUs run through this instruction set. With Lua, we’ll be able to replace it.

When we started digging into the causes of slow pageload times a few years ago, we saw that our CPUs ate a lot of time interpreting templates — useful bits of markup that programmatically told MediaWiki to reuse little bits of text. Templates are everywhere on our sites. Good Wikipedia articles heavily use the citation templates, for instance, and you’ve seen the ubiquitous infoboxes on every biography. In fact, editors can write code to generate substantial portions of wiki pages. Hit “View source” sometime to see how.

But, because we’d never planned for wikitext to become a programming language, these templates were inefficient and hacky (they didn’t even have recursion or loops) and terrible for performance. When you edit a complex article like Tulsi Gabbard, with scores of citations, it can take up to 30 seconds to parse and display the page. Even as we worked to improve performance via caching, query profiling, new hardware, and other common means, we sometimes had to advise our community to remove functionality from a particular template so pages would render faster.

This wouldn’t do. It was a terrible experience for our users and especially hard for our editors, who had to wait for a multi-second roundtrip after every “how would this page look?” preview.

So our staffers and volunteers worked on Scribunto (from the Latin for “they shall write”), a MediaWiki extension to allow editors to embed Lua scripts instead of wikitext for templating. And volunteers and Foundation staffers have already started identifying pages that are slow to render and converting the most inefficient templates. We have 488,731 templates on English Wikipedia alone right now. The process of turning many of those into Lua scripts is going to affect everyone who reads our sites — and the Scribunto project has already started giving back to the Lua community.

Us and Lua

For instance, our engineer Brad Jorsch wrote mw.ustring.lua, a Unicode module reusable by other Lua developers. This library is good news for people who write templates in non-Latin characters, and for anyone who wants a version of Lua’s standard String library where the methods operate on characters in UTF-8 encoded strings rather than bytes.
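To make the bytes-versus-characters distinction concrete, here is a minimal sketch in plain Lua. It does not use mw.ustring itself (which is only available inside Scribunto); the helper name utf8_len is made up for this illustration, and it assumes the input is valid UTF-8.

```lua
-- "héllo" written with byte escapes: é is two bytes in UTF-8 (0xC3 0xA9).
local s = "h\195\169llo"

-- Standard Lua's length operator counts bytes, not characters.
local byte_count = #s            -- 6

-- Hypothetical helper: count characters by counting every byte that is
-- NOT a UTF-8 continuation byte (continuation bytes are 0x80-0xBF).
local function utf8_len(str)
    local _, count = str:gsub("[^\128-\191]", "")
    return count
end

local char_count = utf8_len(s)   -- 5
```

Inside a Scribunto module, mw.ustring.len would play the role of utf8_len here, so template authors working in non-Latin scripts get character counts rather than byte counts.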

And with Scribunto, we empower those frustrated Wikimedians who have been spending years breaking their knuckles making amazing things in wikitext; as they learn how much easier it is to script in Lua, we hope they’ll be able to use those skills in their hobbies, schools, and workplaces. They’ll join forces with the graduates of Codecademy, World of Warcraft, and the other communities that teach anyone to program. New programmers with basic knowledge of computer science who want to do something real with their new skills will find that Lua scripting on Wikimedia sites is a logical next step for them. Our implementation only differs slightly from standard Lua.

And since Scribunto is an extension that any MediaWiki administrator can install, we hope the MediaWiki administrators out there will enjoy using Lua to more easily customize their wikis for their users.

Structured data and new ways to display it

Scribunto lays the foundations for exciting work to come when the Wikidata structured data project comes further online (the Wikidata interface is still in development and being deployed in phases). We know that Lua will be an attractive way to integrate Wikidata information into pages, and we hope a lot of (currently) unstructured data will get structured, helping new applications emerge.

Now that Lua and Wikidata are more mature, we can look forward to enabling more functionality and plugging in more libraries. And as we continue deploying Wikidata, people will make interesting improvements that we currently can’t predict. For instance, right now, each citation is hard to programmatically dissect; the Cite template takes many unstructured parameters (“author1,” “author2,” etc.). We structure these arguments by convention, but the data’s not structured as CS folks would have it, and can’t be queried via APIs, remixed, and so on.
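As a small illustration of what "structured by convention" means in practice, the sketch below gathers numbered parameters such as author1 and author2 into an ordinary Lua list. The helper name collect_numbered and the sample values are invented for this example; it is not how the Cite template or any existing module works.

```lua
-- Hypothetical helper: collect prefix1, prefix2, ... into a list,
-- stopping at the first missing or empty parameter.
local function collect_numbered(args, prefix)
    local items = {}
    local i = 1
    while args[prefix .. i] ~= nil and args[prefix .. i] ~= "" do
        items[#items + 1] = args[prefix .. i]
        i = i + 1
    end
    return items
end

-- Made-up template arguments, in the flat form a citation template receives.
local citation_args = {
    author1 = "First Author",
    author2 = "Second Author",
    title   = "An Example Title",
}

-- authors is now a real list that code can loop over, sort, or count.
local authors = collect_numbered(citation_args, "author")
```

Once the authors are a list rather than a naming convention, a module can iterate over them directly instead of testing each numbered parameter by hand.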

Excerpt of Coordinates module

A screenshot of part of the new Coordinates module, written in Lua by User:Dragons flight. Note that, with Lua, we can actually use proper conditionals.

But in the future, we could have citations stored in Wikidata and then put together onto article pages using Lua, or even assembled into various other forms (automatically generated bibliographies?) using Lua, and they will be easier for Zotero users to discover. That’s just one example; on all our sites over the next few years, things will change from the status quo in a user-visible way. The old math and geography templates were inefficient and hard to hack; once rewritten, they’ll run faster and perhaps editors will use them more. We might see galleries, automatic data analyses, better annotated maps, and various other interesting processes and queries embedded in Wikimedia pages.

Open for change

Wikimedians have been writing wikitext templates for years, and doing hard, astounding, unexpected things with them for readers to enjoy. But the steep learning curve drove contributors away. With Lua, a genuine programming language, people now have a deeper and more useful foundation to build upon. And for years, power users on our sites have customized their experiences with JavaScript/CSS Gadgets and user scripts, but those are basically one level above skin preferences; other people won’t stumble upon your hacks in the process of reading an article.

So, now is the first time that the Wikimedia site maintainers have enabled real coding that affects all readers. We’re letting people program Wikipedia unsupervised. Anyone can write a chunk of code to be included in an article that will be seen by millions of people, often without much review. We are taking our “anyone can edit” maxim one big step forward.

If someone doesn’t like the load time of a webpage, they can now actually improve it themselves. Just as we crowdsourced building Wikipedia, now we’re crowdsourcing bits of infrastructure improvement. And this kind of massively multiplayer, crowdsourced performance improvement is uniquely us.

Wikitext templates could do a lot of things, but Lua does them better and faster, and now mere mortals can do it. We’re aiming to help our users learn to program, to empower themselves, and to help each other and help our readers.

We hope you’ll join us.

Sumana Harihareswara, Engineering Community Manager

New Lua templates bring faster, more flexible pages to your wiki

Starting Wednesday, March 13th, you’ll be able to make wiki pages even more useful, no matter what language you speak: we’re adding Lua as a templating language. This will make it easier for you to create and change infoboxes, tables, and other useful MediaWiki templates. We’ve already started to deploy Scribunto (the MediaWiki extension that enables this); it’s on several of the sites, including English Wikipedia, right now.

You’ll find this useful for tasks for which templates are too complex or slow; common examples include numeric computations, string manipulation and parsing, and decision trees. Even if you don’t write templates, you’ll enjoy seeing pages load faster and with more interesting ways to present information.

Background

The text of English Wikipedia’s string length measurement template, simplified.

MediaWiki developers introduced templates and parser functions years ago to allow end-users of MediaWiki to replicate content easily and build tools using basic logic. Along the way, we found that we were turning wikitext into a limited programming language. Complex templates have caused performance issues and bottlenecks, and it’s difficult for users to write and understand templates. Therefore, the Lua scripting project aims to make it possible for MediaWiki end-users to use a proper scripting language that will be more powerful and efficient than ad-hoc, parser functions-based logic. The example of Lua’s use in World of Warcraft is promising; even novices with no programming experience have been able to make large changes to their graphical experiences by quickly learning some Lua.

Lua on your wiki

As of March 13th, you’ll be able to use Lua on your home wiki (if it’s not already enabled). Lua code can be embedded into wiki templates by employing the {{#invoke:}} parser function provided by the Scribunto MediaWiki extension. The Lua source code is stored in pages called modules (e.g., Module:Bananas). These individual modules are then invoked on template pages. For example, Template:Lua hello world uses the code {{#invoke:Bananas|hello}} to print the text “Hello, world!”. So, if you start seeing edits in the Module namespace, that’s what’s going on.
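As a rough sketch (written for this post, not copied from the wiki), a module page such as Module:Bananas could contain something like the following. The frame argument is the object Scribunto passes to any function called through {{#invoke:}}; this example does not need it.

```lua
-- Module:Bananas (illustrative sketch)
local p = {}  -- table of functions this module exports

-- Called by {{#invoke:Bananas|hello}}; the returned string becomes page output.
function p.hello(frame)
    return "Hello, world!"
end

return p
```

The second part of the #invoke call (hello) names the function to run, so one module page can export several functions and serve several templates.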

Getting started

The strlen template as converted to Lua.

Check out the basic “hello, world!” instructions, then look at Brad Jorsch’s short presentation for a basic example of how to convert a wikitext template into a Lua module. After that, try Tim Starling’s tutorial.

To help you preview and test a converted template, try Special:TemplateSandbox on your wiki. With it, you can preview a page using sandboxed versions of templates and modules, allowing for easy testing before you make the sandbox code live.

Where to start? If you use pywikipedia, try parsercountfunction.py by Bináris, which helps you find wikitext templates that currently parse slowly and thus would be worth converting to Lua. Try fulfilling open requests for conversion on English Wikipedia, possibly using Anomie’s Greasemonkey script to help you see the performance gains. On English Wikipedia, some of the templates have already been converted; feel free to reuse them on your wiki.

The Lua hub on mediawiki.org has more information; please add to it. And enjoy your faster, more flexible templates!

Sumana Harihareswara, Engineering Community Manager

Template folding

Based on several usability studies, the usability user experience team has found that template text and syntax are a major hindrance to new users, making them feel less comfortable editing pages.

As such, one approach that we’ve been experimenting with is collapsing templates into expandable “capsules”. This improves the readability of the wikitext.[1]

The full wikitext of the template is available with the expansion arrow. Additionally, a more user-friendly template editing form is available by clicking on the template name or the ‘pop-out’ symbol to the right of the name.

Since this is an experimental feature that is largely proof-of-concept, it does have a few limitations:

  • Currently works only on Firefox with the editing iframe enabled
  • Pasting content into the expanded template (or inserting a newline in Linux) can break the template, depending on the source of the content
  • The implementation is relatively slow, so slower and older computers can appear to hang, especially on pages with large templates
  • Templates are not converted into capsules as you type; only templates that were there on initial page load are wrapped

We’re still working on these, but in the meantime, test it out on our sandbox[2] and let us know what you think!

[1] We’re working on making the displayed name more customizable on a per-template basis so the collapsed version more accurately summarizes what it’s collapsing, i.e., displaying the title of an infobox rather than the word “infobox”.

[2] This is currently prepopulated with some articles about large US cities. For some good examples, check out New York City, Boston, or Chicago.

Nimish Gautam, Research Analyst

On templates and programming languages

As many folks have noted, our current templating system works ok for simple things, but doesn’t scale well — even moderately complex conditionals or text-munging will quickly turn your template source into what appears to be line noise…

<includeonly><span style="white-space: nowrap;">{{#if:{{{3|}}}|
{{coord|{{{1|0}}}|{{{2|0}}}|{{{3|0}}}|{{{4|N}}}|{{{5|0}}}|{{{6|0}}}|{{{7|0}}}|{{{8|E}}}|{{{9|type:other}}}|format={{{format|dms}}}|display={{#if:{{{title|}}}|inline,title|inline}} }}| {{#if:{{{2|}}}|
{{coord|{{{1|0}}}|{{{2|0}}}|{{{4|N}}}|{{{5|0}}}|{{{6|0}}}|{{{8|E}}}|{{{9|type:other}}}|format={{{format|dms}}}|display={{#if:{{{title|}}}|inline,title|inline}}}}| {{#if:{{{4|}}}|
{{coord|{{{1|0}}}|{{{4|N}}}|{{{5|0}}}|{{{8|E}}}|{{{9|type:other}}}|format={{{format|dec}}}|display={{#if:{{{title|}}}|inline,title|inline}}}}| {{#if:{{{1|}}}|
{{coord|{{{1|0}}}|{{{5|0}}}|{{{9|type:other}}}|format={{{format|dec}}}|display={{#if:{{{title|}}}|inline,title|inline}}}}}}}}}}}}</span></includeonly><noinclude>
{{pp-template|small=yes}}
{{documentation}}
</noinclude>

And we all thought Perl was bad!  ;)

Lua

There’s been talk of Lua as an embedded templating language for a while, and there’s even an extension implementation.

One advantage of Lua over other languages is that its implementation is optimized for use as an embedded language, and it looks kind of pretty.
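To make the comparison concrete, here is a hedged sketch of how the nested {{#if:}} chain shown earlier might read in Lua. The function names (present, default, coord_call) are invented for illustration, the code only decides which arguments would be forwarded to {{coord}}, and it is not taken from the existing extension or from any real module.

```lua
-- True when a parameter was supplied and is non-empty, like {{#if:...}}.
local function present(v)
    return v ~= nil and v ~= ""
end

-- Mirror {{{n|fallback}}}: use the fallback only when the parameter is missing.
local function default(v, fallback)
    if v == nil then return fallback end
    return v
end

-- Decide which positional parameters to forward and which format to use.
local function coord_call(args)
    local positions, format
    if present(args[3]) then
        positions, format = { 1, 2, 3, 4, 5, 6, 7, 8 }, "dms"
    elseif present(args[2]) then
        positions, format = { 1, 2, 4, 5, 6, 8 }, "dms"
    elseif present(args[4]) then
        positions, format = { 1, 4, 5, 8 }, "dec"
    elseif present(args[1]) then
        positions, format = { 1, 5 }, "dec"
    else
        return nil  -- no coordinates given: emit nothing
    end

    local fallbacks = { [4] = "N", [8] = "E" }  -- every other position falls back to "0"
    local positional = {}
    for _, i in ipairs(positions) do
        positional[#positional + 1] = default(args[i], fallbacks[i] or "0")
    end
    positional[#positional + 1] = default(args[9], "type:other")

    return {
        positional = positional,
        format = default(args.format, format),
        display = present(args.title) and "inline,title" or "inline",
    }
end

-- Example: decimal degrees with hemisphere letters (parameters 1, 4, 5, 8).
local call = coord_call({ [1] = "40.7", [4] = "N", [5] = "74.0", [8] = "W" })
```

Here call.positional holds the five values that would be passed on, call.format is "dec", and call.display is "inline". The four cases are the same as in the wikitext version, but each branch fits on one line and the shared defaults are written once.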

An inherent disadvantage is that it’s a fairly rarely used language, so it still requires special learning on potential template programmers’ part.

An implementation disadvantage is that it currently is dependent on an external Lua binary installation — something that probably won’t be present on third-party installs, meaning Lua templates couldn’t be easily copied to non-Wikimedia wikis.

There are perhaps three primary alternative contenders that don’t involve making up our own scripting language (something I’d dearly like to avoid):

PHP

  • Advantage: Lots of webbish people have some experience with PHP or can easily find references.
  • Advantage: we’re pretty much guaranteed to have a PHP interpreter available.  :)
  • Disadvantage: PHP is difficult to lock down for secure execution.

JavaScript

  • Advantage: Even more folks have been exposed to JavaScript programming, including Wikipedia power-users.
  • Disadvantage: Server-side interpreter not guaranteed to be present. Like Lua, would either restrict our portability or would require an interpreter reimplementation. :P

Python

  • Advantage: A Python interpreter will be present on most web servers, though not necessarily all. (Windows-based servers especially.)
  • Wash: Python is probably better known than Lua, but not as well as PHP or JS.
  • Disadvantage: Like PHP, Python is difficult to lock down securely.

Any thoughts? Does anybody happen to have a PHP implementation of a Lua or JavaScript interpreter?  ;)

– brion

Update:

Hampton reminds me that Ruby has some sandboxing features and may also be a contender.