Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts Tagged ‘MediaWiki’

Welcome to FLOSS Outreach Program for Women interns

“Live bird”, an illustration by Kim ‘Isarra’ Schoonover, one of Wikimedia’s interns for the Outreach Program for Women.

I’m glad to announce that Kim Schoonover, Mariya Miteva, Priyanka Nag, Sucheta Ghoshal, Teresa Cho and Valerie Juarez will join the MediaWiki community as full-time interns between January and March 2013. They have been selected as part of the FLOSS Outreach Program for Women, an initiative of the GNOME Foundation, with participation by several other free software projects: Deltacloud, Fedora, JBoss, Mozilla, Open Technology Institute, OpenITP, OpenStack, Subversion, Tor and Wikimedia. A total of 25 women have been selected in this edition.

Each MediaWiki intern will work on a specific project.

Our mentors assisted about 25 women interested in the MediaWiki community during the application process. Of those, 10 submitted full applications, including a project proposal and a microtask completed as part of the submission. We were impressed by the quality of many submissions, but we couldn’t accept more because of a shortage of suitable mentors and funding. We encourage all applicants to stay around in the MediaWiki community, since we have no doubt they can all become top contributors and have future opportunities.

In addition to the availability of specialized mentors and the quality of the proposals, our selection criteria took into account the diversity of profiles, origins and previous involvement in Wikimedia projects. We had a rather open process for submissions and selection of candidates that gave participants a chance to check other proposals, improve their own and receive public endorsements.

We wish a happy landing to our new interns and the best of luck in their projects! You’ll be hearing more from them over the next few months.

Quim Gil, Technical Contributor Coordinator

Fix this broken workflow, and help thousands of Wikipedians

In the 10+ years since its founding, Wikipedia has become an indispensable source of quality information for Internet users everywhere. Here at the Wikimedia Foundation, we’re very proud to support such a project. Yet, although Wikipedia is a household name, some issues with our user experience remain deeply troubling.

This is especially true for the smaller contingent of people who are the regular contributors to the encyclopedia. Wikipedia’s user interface has failed to keep pace with the encyclopedia’s growth, and the lack of a modernized editor experience has contributed to a decline in both the recruitment and the retention of editors (a trend that started around 2007).

The Editor Engagement Experiments team tries to reverse this trend by defining, measuring and fixing these important editing workflows, and improving the experience of Wikipedia volunteers who create content. In this post, we’ll show you one of these editing workflows and invite developers to try their hand at implementing a solution.

An example problem

Imagine you want to create an article for English Wikipedia. You begin by searching for the article on Wikipedia and find that there isn’t one on the topic yet.  This is the screen you get.  Can you figure out how to create the article?

The answer is to click on the red link — that’s intuitive, right?

Even if you figure this out, you’re going to have problems. If you don’t have an account (like most readers), you’ll encounter another hurdle: the site will simply tell you that you don’t have permission to create the page. The solution is to create an account, but it doesn’t say that on the page.

Let’s say you register for an account (or log in if you have one) and then get back to the task at hand. Great. But not so much if you’re new to Wikipedia, because all we do is dump a blank text box on you and hope you know what you’re doing. There’s no warning that articles not meeting Wikipedia quality standards will be swiftly deleted. You could start by getting your feet wet by trying out one of the several workflows that are safer for starting a page, but none of these alternatives is presented as an option.

Thousands of people are subjected to this experience every month, and all they’re trying to do is add to the world’s collective knowledge. If all of this makes you a bit angry, keep reading.

(more…)

Primary data about languages

For MediaWiki, the CLDR, or Common Locale Data Repository, is a primary source of information. The language data that Unicode maintains in this standard is what is most relevant to us. For each language it records the name in English, the autonym (the name in the language itself), what a date and a number look like, the script or scripts used, and the names of other languages in that language.

We prefer to use standardised information, not only because it is stable and reliable, but because we do not have to collect the data ourselves and also because the data is used by many other organisations and in many other applications. We love the CLDR and we want it to be even better. To make it better we need your help.

Many of the languages that have a Wikipedia and many of the languages that want to have a Wikipedia are not represented in the CLDR. Many Wikipedians know their language really well. They can provide the information about their language and they can verify that the existing information is correct. To change anything, you will need to create a user account.

When a language is not yet supported, you will have to request that the new locale or language be added. You are expected to provide at least the core data when you make your request and to complete at least the minimal data required. One of the questions asks where the language is official; it may be that a language does not have any official status. This does not prevent people from reading or writing that language and it does not mean that information about such a language is not important to us.

When a language is already supported, we want you to verify that the names for other languages exist and are correctly written. There can be issues in any language, including English; using the Araucana name for the Mapudungun language is considered an insult.

When you are able and happy to help us in this way, you may be interested in joining our “language support team.” Because of your interest you belong to the group of people we first want to turn to when we have questions about supporting your language. More structured information and room for your reports can be found here. When there are any issues, do not hesitate to report them.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

First of many MediaWiki 1.20 deployments has begun

The logo of MediaWiki (a yellow sunflower surrounded by two pairs of blue square brackets) with gradients symbolizing its coming of age for the next version

Wikimedia sites will gradually be upgraded to version 1.20 of MediaWiki in April 2012.

Wikimedia engineers have finished up the latest version of MediaWiki, the software that powers Wikipedia and its sister sites. We have begun deploying this version, labeled “1.20wmf1,” to all Wikimedia sites in stages. We started on April 10th and will continue until April 25th.

Yes, we only deployed MediaWiki 1.19 a few weeks ago. This new update is part of our effort to get you fixes and improvements much more regularly (a reason we recently switched from Subversion to Git).

We plan to deploy the latest software every two weeks. Rather than calling each version of the deployed software 1.20, 1.21, 1.22, etc. every two weeks, we’ll be using a variation of the “1.20” moniker for the next few months.

We’re decoupling our deployment process (to Wikimedia sites like Wikipedia) from our release process for the standalone MediaWiki installer used on third-party sites. We plan to have MediaWiki 1.20wmf1 and 1.20wmf2 in April, 1.20wmf3 and 1.20wmf4 in May, etc., until we actually release a new MediaWiki 1.20 installer this fall (probably October).

Only after this point will we start referring to deployments as “1.21” deployments. The cycle will repeat approximately every six months, with Wikimedia deployments every two weeks, and installer releases every six months.
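
The naming scheme can be sketched in a few lines of Python. This is only an illustration: the function name and parameters are hypothetical, and the dates simply assume a strict 14-day cadence counted from April 10th, so the roadmap remains the authoritative schedule.

```python
from datetime import date, timedelta

def deployment_labels(start, count, series="1.20", interval_days=14):
    """Return (label, date) pairs for successive deployments.

    Hypothetical helper: assumes one deployment every `interval_days`
    days, each labeled <series>wmf<N> with N counting up from 1.
    """
    return [
        (f"{series}wmf{n}", start + timedelta(days=interval_days * (n - 1)))
        for n in range(1, count + 1)
    ]

if __name__ == "__main__":
    # Start date taken from the post; later dates are illustrative only.
    for label, day in deployment_labels(date(2012, 4, 10), 4):
        print(label, day.isoformat())
```

Under these assumptions the first two labels fall in April and the next two in May, which matches the plan described above.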

We’ve already tried out the 1.20wmf1 version on a test wiki and on mediawiki.org, and things are looking good. But the schedule may change based on unexpected issues, so you should refer to the MediaWiki 1.20 roadmap for an up-to-date schedule of when your wiki will be affected.

What’s new

This is a fairly small set of changes, compared to the March deployment of MediaWiki 1.19. This is intended to minimize disruption and possible issues, and make it easier to identify the cause of problems, since the possibly problematic code will be much more recent.

The biggest thing you’ll notice is the new diff style (example on mediawiki.org), designed to improve the experience of color-blind and partially sighted visitors.

More polish you’ll notice: There is a new option on Special:Prefixindex and Special:Allpages to hide redirects (addressing bug 30963). New edit emails for watched pages always provide a link to the edit which triggered the mail (fixing bug 32210).  And “Creating” is now given in the page title instead of “Editing” when you are creating a page (fixing bug 22870).

And, of course, developers have improved the software “under the hood” in many ways. A list of all changes is available in the draft release notes.

Snags and glitches?

If, despite our efforts, you encounter issues due to the upgrade, we’ll try and fix them as soon as we can. Get an account and report issues in our bug tracker, which is where we look for reports of problems. And the faster you tell us about problems, the faster we can address them.

Thanks!

Sumana Harihareswara, Volunteer development coordinator
Rob Lanphier, Director of Platform Engineering
Images contained in this blog post are available under CC-BY-SA

Wikimedia engineering moving from Subversion to Git

Hello, MediaWiki developers and users! You may already be aware of this: our community is embarking on a journey to leave Subversion behind and migrate to Git for our source code repositories, starting on March 3rd (update as of February 29th: moved to March 21st). This is not an easy task. Here I’ll outline our rationale for this move, as well as our planned process.

What is Git?

Git is a distributed version control system originally developed by Linus Torvalds and others to manage the Linux kernel. In the past couple of years, it has taken off as a very robust and well-supported tool. “Distributed” means that no central copy of the repository is required. With Subversion, Wikimedia’s servers host the repository and users commit their changes to it. In contrast, with Git, once you’ve cloned the repository, you have a fully functioning copy of it, with all the branches and tagged releases at your disposal.

Why switch?

Three major reasons:

To encourage participation: Since Git is distributed, it allows people to contribute with a much lower barrier to entry. Anyone will be able to clone the repository, make their own changes and keep track of them. And if you’ve got an account in our code review tool (Gerrit), you’ll be able to push changes for the wider community to review.

To fix our technical process: Subversion has technical flaws that make life difficult for developers. Notably, its implementation of branching is not very easy to use, which makes it hard to work with “feature branches”. Our community is very distributed, with many parallel efforts, and needs to integrate many different features, so we’d like to use feature branches more. Git branches are very easy to work with and merge between, which should make things easier for our development community. (Several other large projects, such as Drupal and PostgreSQL, have made the same switch for similar reasons, and we’ve done our best to learn from their experiences.)

Some quotes from our community:

“I love git just because it allows me to commit locally (and offline).” – Guillaume Paumier

“[Y]ou can create commits locally and push them to the server later (great for working without wifi), you can tell it ‘save my work so I can go do something else now’ in one command, and it’ll allow us to review changes before they go into “trunk” (master)…. without human intervention in merging things into trunk. Gerrit automates this process.” – Roan Kattouw

And finally, to get improvements to users faster: with better branching and a more granular code review workflow that suits our needs better, plus our ongoing improvements to our automated testing infrastructure, we won’t have to wait months before deploying already-written features and bugfixes to Wikimedia sites.

We had years of discussion before we finally decided to switch, but now we can look forward to more flexibility and power in our engineering processes.

What are we doing?

We’ve now done almost all the back-end work of preparing our repository for the move and are in the final steps of preparation (details). We’ve also written explanations of the new workflow, the migration schedule, issues yet to be addressed, and other related topics. Right now, we’re asking people to stop creating any new extensions in Subversion, and to watch the wikitech-l mailing list for more updates.

What are the next steps?

Over the next two and a half weeks, the Git repository that contains MediaWiki core and extensions will be brought in step with Subversion, and at first it will be read-only (no one will be able to push changes). This will allow developers to start cloning it to their local machines and getting used to things.

For MediaWiki core and for extensions that the Wikimedia Foundation deploys on its wikis, the switchover is pencilled in for the weekend of March 3rd (Update as of February 29th: the new migration date is Wednesday, March 21st). We’ll do core first, and then extensions after, but hopefully all in the same weekend. After the successful migration, the Subversion repository (for the directories that have moved to Git, such as /trunk/phase3/) will be made read-only.

See the full schedule.

I develop for a Wikimedia project. Do I have to switch to Git?

Only two projects are affected immediately: the core of MediaWiki and the extensions that get deployed on Wikimedia Foundation projects.

So, if you work on an extension that the Wikimedia Foundation does not use, or on a non-MediaWiki project hosted at svn.wikimedia.org, you have more time to decide. Talk it over with your community and decide whether you would like to move to Git immediately, move to Git sometime over the next several months, or move to another hosting provider sometime before mid-2013. We would like to gradually migrate all projects currently on Wikimedia’s Subversion repository so that we can make all of svn.wikimedia.org read-only by the middle of 2013, and thus only have to support one source control infrastructure.

More details.

Will training and documentation be available? When?

Yes, we will provide training and documentation to help you use the new workflow. Check our Git page and its links now, and watch that space! There will be more documentation as well as some interactive training sessions before the big switchover in early March.

If you have any questions, please ask in #mediawiki on Freenode or on wikitech-l.  Thank you!

Chad Horohoe
Git migration lead
Platform Engineering department
Wikimedia Foundation

Sumana Harihareswara
Volunteer Development Coordinator
Platform Engineering department
Wikimedia Foundation

The #MediaWiki #hackathon in Pune, #India

When good people get together in a friendly, well organised setting like this weekend in Pune, many great things happen. Several MediaWiki developers had come to provide the many people new to MediaWiki with their expertise and guide people into its inner workings.

Many people worked on Wikimedia mobile and the SmartPhone software, others worked on MediaWiki and its extensions. Bugs got fixed and functionality got extended.

One of the surprises was two people working on the localisation for the Mongolian language. The inclusion of a web font that will support the Dzongkha language is another.

Dzongkha is the official language of Bhutan and, according to Ethnologue, the script used is either Tibetan script, Uchen style or Tibetan script, Umed style. These scripts and styles are also used for the Tibetan language, so it is not only Dzongkha that stands to benefit.

One of the highlights of the work on the SmartPhone app is support for scripts that are written from right to left; this is now “beta” functionality. The result of more people looking at the code was that several bugs received the attention needed to make them go away. Scrolling was one area that got attention; this results in a smoother user experience.

New input methods have been created for inclusion in Narayam: one for Punjabi transliteration and one for Gujarati. The continued collaboration with RedHat engineers ensures that our work benefits both MediaWiki and RedHat/Fedora. We do realise that there is still a lot to do, and it is not only documentation. Additional work was done on the “visual on-screen keyboard” that was started at the previous hackathon in Pune; it still needs more testing and design work.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

Techies learn, make, win at Foundation’s first San Francisco hackathon

Participants at the San Francisco hackathon in January 2012

In January, 92 participants gathered in San Francisco to learn about Wikimedia technology and to build things in our first Bay Area hackathon.

After a kickoff speech by Foundation VP of Engineering Erik Möller (video), we led tutorials on the MediaWiki web API, customizing wikis with JavaScript user scripts and Gadgets, and building the Wikipedia Android app.  (We recorded each training; click those links for how-to guides and videos.)  We asked the participants to self-organize into teams and work on projects.  After their demonstration showcase, judges awarded a few prizes to the best demos.

(more…)

The end of a slushed sprint

Consolidation was the name of the game for the past sprint for Wikimedia’s Localization team. A bug triage, testing, documentation and bug fixes were the activities designed to make our software more stable and more usable. When you read the bug triage report it becomes clear how much the devil is in the details; real native language expertise is needed to understand and assess the issues  we aim to solve. Read the report and you will see how much we rely on our community, on people like Srikanth and Nemo_bis.

Now that we are writing documentation in a central place, like here on the language statistics of the Translate extension, we are able to provide you with a help text that is specific to the context. For the language statistics it is a help text about “statistics and reporting”. This functionality is ready and will become available with the deployment of January 30. You can help us and yourself by reading and understanding the text. Ask when you have questions; you can also translate the text and make it that much more your own.

Narayam is another extension that has been improved with user documentation. This documentation is completely new and it can effectively replace existing documentation. The existing documentation has the benefit of being written in the local language and we expect that what is written will be similar to the Narayam documentation. The language communities can then decide if they want to point to the local documentation. Like all our software, the Narayam documentation will be available for translation. Having the translation ready may be one of the considerations.

A lot of work is going into the description of the many input methods, like the InScript layout for Assamese. These descriptions are “must have” help information when you do not know a particular keyboard layout by heart. They also provide a wonderful opportunity to verify that our implementation of a particular keyboard method is correct. This is yet another instance where native speakers can help us a lot.

Testing and coming to grips with the different tools was a major goal for this sprint. PHPUnit and QUnit are used to test PHP and JavaScript respectively, and the tests are run in Jenkins (for PHP) and TestSwarm (for JavaScript). As our team is so much into language support, we are learning what the limits are for testing for different languages and scripts.

All in all there may have been a slush and we have done a lot of code review, but we also managed to make sure that our functionality has gained stability for this and future releases. Additionally, work was done on grammar support for JavaScript, but the patch for that was stuffed in a bug report because of the slush, as the story was moved to the next sprint. Grammar support is what fills the gap in localization support between JavaScript and PHP and makes it available to any and all other developers.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

Sprinting ahead when there is a “slush”

When there is a code freeze or a slush, the potential for what is to be delivered is curtailed. It is official; you will not deliver new code, you will work towards consolidation of the new MediaWiki release.

One of the objectives for this and the next release is that the time between releases will decrease. Even though the Localization team works in two-week sprints, it can help with getting the release out of the door. The first thing to do is to help even more with code review; the other is to make sure that its code is optimised for easy coding, testing and use.

When you check out Mingle (user guest, password guest), you will find that the developers of our team are learning about the various testing tools. They are even updating the developer documentation to make it easier to understand how to set up new automated tests.

When you are testing, it is necessary that code provides information about its execution. This realization means that the code needs to be refactored in order to allow for testing. Documentation is another part of the puzzle that helps stabilise code; you will find a prodigious amount of documentation that is scheduled for this sprint.

All this translates into quite a minimal deployment for the first week. Its highlights are:

Translate:

  • Better error checking and handling in Special:Translate
  • Translatable page id prefix changed from page| to page-
  • Don’t reuse messages from core

WebFonts:

  • Fixed download of Vemana Telugu font
  • Added font for Ahirani (ahr)

Narayam: Some fixes to Assamese transliteration rules

Core: the cropping of text in level 1 headers is fixed for Indic languages

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

Addressing the many

When you have a message, you use the appropriate language and tools to address multiple people. We do not use our eyes to see how many people we address and we do not use a bull horn to be heard. Our MediaWiki software knows the numbers involved and a plural enabled message will be formed according to the rules of the language.
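
As a rough illustration of how a plural-enabled message might be resolved, the Python sketch below picks a form from a count and fills in the message. The function names, the rule table and the sample messages are hypothetical and are not MediaWiki's actual code; the English and Russian rules cover whole numbers only and are stated here as an assumption about the CLDR rules.

```python
# Hypothetical sketch: choose a plural form for a count, then fill in the message.
# The rule functions and the message table are illustrative, not MediaWiki code.

def english_form(n):
    """English: one form for exactly 1, another for everything else."""
    return "one" if n == 1 else "other"

def russian_form(n):
    """Russian, whole numbers only; decimals would need a further form."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return "few"
    return "many"

RULES = {"en": english_form, "ru": russian_form}

MESSAGES = {
    "en": {"one": "{n} page", "other": "{n} pages"},
    "ru": {"one": "{n} страница", "few": "{n} страницы", "many": "{n} страниц"},
}

def format_count(lang, n):
    """Look up the plural form for n in the given language and format the message."""
    form = RULES[lang](n)
    return MESSAGES[lang][form].format(n=n)

if __name__ == "__main__":
    for n in (1, 2, 5, 11, 21):
        print(format_count("en", n), "|", format_count("ru", n))
```

Keeping the rules in a per-language table means a new language only needs a rule function and its message forms, which is roughly the kind of data a standard such as the CLDR can supply.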

When we implemented plural support for JavaScript, we checked our new plural implementation against our implementation in PHP, and we checked both against the standard for such things, the CLDR.

The Localisation team does not know the language rules for the 280+ languages that have a Wikipedia. We prefer to implement what the standard tells us but we support more languages than the CLDR. We want to channel our need for support through “Language Support teams” and we want them to help us understand  and fix the inconsistencies and add the missing information to the CLDR.

Inconsistencies with the CLDR
  • Belarusian – ‘other’ form missing in MediaWiki
  • Belarusian-tarask – ‘other’ form missing in MediaWiki
  • Bosnian – ‘other’ form missing in MediaWiki
  • Manx – CLDR has 3, MediaWiki has 4 forms
  • Hebrew – CLDR has 2, MediaWiki has 3 forms
  • Croatian – ‘other’ form missing in MediaWiki
  • Ripuarian / Colognian – order of forms different. CLDR says 0, 1, other. MediaWiki says 1, other, zero
  • Latvian – CLDR defines zero, one, other forms. MediaWiki has only two forms, one for (1, 21, 31, 41, 51, 61…) and another for the rest.
  • Macedonian – CLDR defines forms[0] for n!=11. MediaWiki defines forms[0] for n%100!=11
  • Polish – ‘other’ form is not defined in MediaWiki.
  • Russian – CLDR defines 4 plural forms. Form with decimals missing.
  • Slovenian – MediaWiki defines a zero form which is not present in CLDR
Missing in CLDR
  • Church Slavonic
  • Lower Sorbian
  • Scottish Gaelic
  • Upper Sorbian
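
The kind of cross-check that surfaces such inconsistencies can be sketched in Python. The example compares two versions of the Macedonian rule from the list: only the n != 11 and n % 100 != 11 conditions come from the list, while the shared n % 10 == 1 condition and all function names are assumptions made for illustration.

```python
# Hypothetical sketch: find the counts on which two plural rules disagree.

def cldr_macedonian(n):
    # Form 0 unless n is exactly 11 (the "n != 11" condition from the list).
    # The n % 10 == 1 part is an assumption made for illustration.
    return 0 if n % 10 == 1 and n != 11 else 1

def mediawiki_macedonian(n):
    # Form 0 unless n ends in 11 (the "n % 100 != 11" condition from the list).
    # The n % 10 == 1 part is the same illustrative assumption as above.
    return 0 if n % 10 == 1 and n % 100 != 11 else 1

def disagreements(rule_a, rule_b, upper):
    """Return every count from 0 to upper (inclusive) where the rules differ."""
    return [n for n in range(upper + 1) if rule_a(n) != rule_b(n)]

if __name__ == "__main__":
    print(disagreements(cldr_macedonian, mediawiki_macedonian, 1000))
```

Under these assumptions the two rules agree on every count below 100 and first diverge at 111, which shows why such a comparison needs to run well past the small numbers a quick manual test would cover.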

Please make a difference for the support for your language and join the Language support team.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant