Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts Tagged ‘India’

Wiki women joining Indic languages

Netha Hussain

User:Netha Hussain’s inspiring story is a wonderful way of celebrating Women’s History Month. Netha is a woman editor of the Malayalam language Wikipedia from the state of Kerala in India.

Netha is both a medical student and a Wikipedian. She mostly edits articles related to medicine/biology, literature and women’s biographies. She used to maintain a portal for biology on the Malayalam Wikipedia and is presently working to create and improve its most important health articles.

Netha recalls how she landed up on Wikipedia while searching for ‘Chammandi’, a kind of chutney made in Malayali cuisine, and, after realizing there was no article on it, started one herself. She was initially reluctant to edit in Malayalam, but it was through Wikipedia that she brushed up her language skills well enough to write a Featured Article in Malayalam within a year! On the English Wikipedia, she started by editing the article about her college.

As is so often the case, until Netha landed up at a WikiAcademy in Kozhikode, not many knew she was a female editor! She has taken up the challenge to bridge this gender gap and now runs mailing list discussions for women Wikimedians in Malayalam to share their experiences and build offline relationships. “Most of my friends online are Wikipedians”, she quips.

About welcoming women editors on Wikipedia, especially its Indic language versions, she says, “The community is very receptive to women editors. I was not privileged or discriminated just because I was a woman. I was encouraged to work on women’s biographies which were mostly stubs. With my help, many good quality articles on famous women were created on Malayalam Wikipedia.”

As in Netha’s case, in most Indic language Wikipedias it is easier to make substantial contributions than in other projects. Netha believes that the role of women is no different from the role of men in contributing to free knowledge movements.

Netha believes her medical dreams and her Wikipedia editing reflect aligned missions “to empower people with knowledge and fulfill our duties towards the society.” (To reach out to her, the best place is her talk page.)

Noopur Raval, Consultant (Communications), India Program, Wikimedia Foundation

The end of the tenth sprint

Every two weeks a development sprint is finished. Every two weeks we evaluate what we achieved, what went well and what went wrong. Many of the stories of sprint 10 can be found in Mingle (user:guest, password:guest). There you see the stories that were accepted or postponed.

The stories that ended happily are all over the place.

  • The Ahirani language, a language of India that uses the Devanagari script in the same way as Marathi, is now supported for web fonts and input methods.
  • When a translation administrator encourages or discourages the translation of a text, this will now be logged. This helps translators prioritize their activities.
  • WebFonts now uses the MicroType Express font compression technology. This makes sending fonts to your browser go much faster.
  • A translator can indicate how and how often they want to be contacted. In true agile fashion, the software that will make use of this will be written in a future sprint.
  • Some texts only need to be translated in selected languages because they will reach a specific public or because it will be used in software that supports a limited number of languages. New functionality enables a translation administrator to select these languages.
  • We did a lot of code review; it gets done because it is part of our plan.

A few stories did not end on a high note:

  • Configuring one translation memory for all the wikis where the WMF needs translation took much longer than expected. The idea was to build it first on Labs; that idea has now been shelved and it will be configured directly in production.
  • A lot of work has gone into EasyTimeline, to make its functionality usable in other scripts and in languages that are written from right to left. It works after a fashion and many issues have been resolved. Sadly, the devil is in the details: Ploticus is a dependency of EasyTimeline and it has a bug in creating SVG output. We have no plan to fix this bug in Ploticus ourselves, but we are trying to find developers who can. Until then, we cannot make progress on this feature. Please let us know if you are interested in working on this issue for us.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

Fonts and their use in source texts

When a text is written, when it is printed for the first time, it will have a contemporary look. When you then look at historic texts, at a first publication, you will notice the many details that show its age. These can be differences in orthography, differences in vocabulary and also differences in the layout and the fonts used.

When sources are published in Wikisource, maintaining the atmosphere of the original text is very important. It is why the original orthography and vocabulary are maintained, and with the availability of the WebFonts extension there is the potential to use fonts that give this impression of age.

In the office hours of the Localisation team, the question was raised whether we could support cuneiform. The answer was that we can when there is a freely licensed font. We found a freely licensed cuneiform font and it has been made available on the wikis that support WebFonts. The bigger question, however, is about all the other scripts that are of historic significance. This is of particular relevance to the Sanskrit Wikisource; the Sanskrit language is written in many scripts and it is only recently that the Devanagari script became the default script.

For sources like the Quran, maintaining the original orthography and characters is an article of faith. It is for this reason that characters were added to Unicode: alternate representations of the same characters were missing. We do have a beautiful freely licensed font, the Amiri font, and we would love to support it in MediaWiki, but we are struggling with technical issues.

For the Wikimedia Localisation team, it is impossible to identify all the needs for fonts for historic text representation. This is why we have language support teams. They know their language, they can identify a need and hopefully they can identify usable freely licensed fonts. When they do, we can and will support those fonts. In the meantime we will continue our work on a unified language selector. This will make the use of WebFonts easy and obvious. At this time it works, but it is hard work for you as a user.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

The #MediaWiki #hackathon in Pune, #India

When good people get together in a friendly, well organised setting like this weekend in Pune, many great things happen. Several MediaWiki developers had come to provide the many people new to MediaWiki with their expertise and guide people into its inner workings.

Many people worked on Wikimedia mobile and the SmartPhone software, others worked on MediaWiki and its extensions. Bugs got fixed and functionality got extended.

One of the surprises was two people working on the localisation for the Mongolian language. The inclusion of a web font that will support the Dzongkha language is another.

Dzongkha is the official language of Bhutan and, according to Ethnologue, the script used is either the Tibetan script, Uchen style, or the Tibetan script, Umed style. These scripts and styles are also used for the Tibetan language, so it is not only Dzongkha that stands to benefit.

One of the highlights of the work on the SmartPhone app is support for scripts that are written from right to left; this is now “beta” functionality. The result of more people looking at the code was that several bugs received the attention needed to make them go away. Scrolling was one area that got attention; this results in a smoother user experience.

New input methods for Punjabi transliteration and for Gujarati have been created for inclusion in Narayam. The continued collaboration with Red Hat engineers ensures that our work benefits both MediaWiki and Red Hat/Fedora. We do realise that there is still a lot to do, and it is not only documentation. Additional work was done on the “visual on-screen keyboard” that was started at the previous hackathon in Pune; it still needs more testing and design work.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

Insights from mobile user experience research

Mobile Wikipedia readers in Brazil

As part of our commitment to provide free knowledge to everyone, the foundation has been redesigning our mobile platform (m.wikipedia.org and mobile.wikipedia.org) to enhance the reading experience and allow editing.  As a first step towards the redesign of the mobile gateway to better meet the needs of our users in the Global South, we conducted user experience research in India and Brazil among current and future users of Wikipedia mobile last summer.  We also carried out user experience research in the US to have a comparison with a mobile market which is more mature in terms of smartphone and 3G penetration, and has a more widespread adoption of tablets.

Our research in India and Brazil brought forth the following four opportunities with the greatest perceived impact for the mobile platform:

  1. Improving our search: Our research revealed that there was a need to provide search suggestions, autocomplete, autocorrect and other tools that ease typing and search burdens on mobile devices; support search in all language Wikipedias as well as allowing users to choose and switch between languages; incorporate transliteration tools for languages with fonts and characters that have poor mobile support; support and even enhance users’ existing habits to use Google to reach Wikipedia articles; and enable users to search within a Wikipedia page. We are happy to report that, drawing from the research, our mobile team has already implemented some of these opportunities, like full page search, autocomplete and inter-wiki links, in our mobile beta site.
  2. Optimizing our reading experience for mobile devices and generalized use.  Through our research, especially in India, we found that we were not redirecting a large breadth of devices in use to our mobile site. The mobile team quickly fixed this issue with the adoption of the open source library tera-WURFL for detecting mobile devices.  After speaking with respondents in India and Brazil, we found that there was a desire among users to modify or set one-time preferences for the display of images, the font size, and any element that affects page loading time and size. Similarly, there is an opportunity for allowing  preferences for language and navigation; the ability to watch or bookmark articles; or save content offline; offer content in more digestible pieces, or with quicker access (i.e. preview or easy access to the first paragraph, or a new “mobile summary”); search offline, i.e., while in transit or without a data plan; and generally follow expectations set by mobile web interactions and standards.  Some of these recommendations have been incorporated into our mobile product strategy.  Through this research we felt it was crucial to offer both an official iOS and Android app (which was officially released in January) that offers at minimum a simple and easy search and reading experience.
  3. Using the mobile platform to increase user engagement and awareness of features on Wikipedia as well as providing new opportunities for participation. The mobile site and potential apps provide many new pathways for engagement, participation, and contribution. At present, the mobile site can be used to build awareness around existing features on the site that current users are blind to (i.e. watchlists, accounts, editing, inter-language links, history); to provide features that make opening a Wikipedia account worth having, something that the majority of our participants do not currently see any reason to have; increase visibility of local language Wikipedias, especially in India since many English readers were not aware of the existence of Indic Wikipedias; prompt users to download an official app when possible; and interface with other web content on mobile devices (Google, news, entertainment, and sports content, for example “Wikitap”). The contributions that showed the highest potential for adoption were adding photographs, “flagging” or “marking” something that needs to be edited, removing or marking vandalism, adding links, adding location or geodata, and potentially making small typing or formatting edits.
  4. Mobile Editing. And finally, the mobile site can support the editing practice of existing editors by first offering those features in a mobile friendly format which are currently in high use on the site.  Those with the highest demand and potential are the “recent changes” page, which is consumed like an update feed or email; accessing watch lists; making reverts, especially with respect to vandalism; logging in and accessing account and user pages; and serving discussion pages and article histories.

 

If you are interested in reading about our research in India and Brazil in detail, we have compiled the insights in a report which is available in PDF and wiki format. You can also watch video highlights from the interviews and check out some photographs from the field work in India and Brazil.

Mani Pande, Head of Global Development Research

Indian Language Wikipedia Statistics – October 2011

Here are the statistics of Indic language Wikipedias for the month of October 2011. The data for this report is taken from http://stats.wikimedia.org/

I have restructured my report to make it shorter and easier to read and compare – but without losing any of the data points. I have divided it into Quality of Projects, Community Building, and Readership.

NOTE: I have used the Indian way of denoting large numbers: Crore is equal to 10 million, and Lakh is 100,000.
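For readers unfamiliar with these units, here is a minimal sketch of the conversion. The function names `to_indian_units` and `indian_grouping` are illustrative only and are not part of any statistics tooling; the sketch assumes non-negative whole numbers.

```python
LAKH = 100_000        # 1 lakh = 100,000
CRORE = 10_000_000    # 1 crore = 10 million = 100 lakh


def to_indian_units(n: int) -> str:
    """Express a non-negative count in crore or lakh, whichever fits."""
    if n >= CRORE:
        return f"{n / CRORE:.2f} crore"
    if n >= LAKH:
        return f"{n / LAKH:.2f} lakh"
    return str(n)


def indian_grouping(n: int) -> str:
    """Group digits the Indian way: the last three digits, then pairs."""
    digits = str(n)
    if len(digits) <= 3:
        return digits
    head, tail = digits[:-3], digits[-3:]
    pairs = []
    # Peel two digits at a time off the right-hand end of the head.
    while len(head) > 2:
        pairs.insert(0, head[-2:])
        head = head[:-2]
    pairs.insert(0, head)
    return ",".join(pairs + [tail])
```

So a page-view figure of 25,000,000 would read as 2.50 crore, and 150,000 as 1.50 lakh.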

 

Community

In the table below are new users who have edited at least 10 times, existing editors with at least 5 edits in that month, and existing editors with more than 100 edits in that month. Once again, it is essential to look at all three numbers in conjunction with each other.
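To make the three thresholds concrete, here is a rough sketch of how such counts could be derived from per-editor edit totals. The `Editor` record and the `summarise` function are hypothetical, and the reading of "new user" as someone whose lifetime total crosses 10 edits during the month is an assumption on my part, not something the statistics site is quoted as saying here. The sketch also lets the categories overlap.

```python
from dataclasses import dataclass


@dataclass
class Editor:
    name: str
    total_edits_before: int  # lifetime edits before the month started
    edits_this_month: int


def summarise(editors):
    """Count editors against the three thresholds used in the report."""
    counts = {"new": 0, "active": 0, "very_active": 0}
    for e in editors:
        total_now = e.total_edits_before + e.edits_this_month
        # Assumption: "new" means crossing 10 lifetime edits this month.
        if e.total_edits_before < 10 <= total_now:
            counts["new"] += 1
        if e.edits_this_month >= 5:      # at least 5 edits in the month
            counts["active"] += 1
        if e.edits_this_month > 100:     # more than 100 edits in the month
            counts["very_active"] += 1
    return counts
```

Looking at the three counts side by side, as suggested above, shows whether a wiki is gaining newcomers, keeping a steady base, or leaning on a handful of heavy contributors.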

Something that I have been reflecting on is how even in relatively small communities (which is what almost all Indic communities are) there is still a relatively low number of new users coming on board and a very tiny number of editors have edited more than 100 times. The former is self-evident as a problem because it means we need to do so much more to encourage new editors. The latter is worrying because it means we also need to do much more to encourage editor retention as well as editor motivation.

Malayalam and Tamil have the healthiest position on this table – across all three parameters and looking at progress month-on-month. This is most probably because of the strong efforts at community building in both communities. It is really important that these communities continue to build on their strong foundations.

I am particularly excited about two languages in this list. Both Marathi and Bengali editor counts have increased across all parameters and that is very encouraging. They are large languages with massive potential. I am also really hopeful that the Marathi media coverage around last month’s WikiConference is going to support the community as they go about encouraging and supporting new and existing editors.

Overall, though, it must be said that the total number of new editors coming to Indic Wikipedias is low. So the focus needs to be on bringing new editors to the wikis and retaining existing users.

Quality of Projects

(more…)

Ready for the WebFonts launch

After months of preparation, demonstrating the latest versions in person and on-line, going through tons of feedback and implementing the resulting modifications, we are ready for the launch of WebFonts. Web fonts are a technology that ensures that the readers of our wikis will always see the intended characters on their screen. Many devices do not provide the fonts that people need to read their mother tongue.

When people do not even see what we aim to provide to them, we fail. According to the Wikipedia article, web fonts are considered “controversial” because the licenses of many fonts prevent them from being used as web fonts. There is no such controversy when freely licensed fonts are used and we are really happy with our collaboration with the producers of such fonts.  We learned that fonts working on one platform do not necessarily work as well on another platform / operating system.

Enabling people to read and write their language is at this time our prime objective, and people are happy when they find they can, as they did at the localisation sprint in Pune. Being able to type Marathi or Punjabi, Hindi or Tamil on a thin client put a smile on many faces. They used the latest software at translatewiki.net, and the feedback we got from them and others has resulted in many technical and usability improvements.

The launch of WebFonts together with the Narayam improvements on Monday 12 December represents significant progress in helping enable Indic language contributions to our projects; it consists of a large amount of code, it will be implemented on a selected range of wikis and it affects many communities: the wikis in Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Tamil and Telugu. All these communities have been involved in testing the evolving functionality at translatewiki.net, and the comments and bug reports we received were essential for what we are now proud to present. With the launch more people will experience the WebFonts technology for the first time. We are eager to improve on what we have because we believe that the web fonts technology is crucial for the emancipation of many languages and scripts in this digital age.

Thanks,

Gerard Meijssen

Internationalization / Localization outreach consultant

WikiConference India


WikiConference India 2011 started out for me in a dingy garage at the University of Mumbai – with 3 perturbed University staff, 2 stray dogs, 1 vehicle sorely in need of a paint job and a very good-spirited Jimmy Wales for company! I suppose one always knew that India was unpredictable – but even by those standards, this was a bizarre way for the festivities to commence!

We were sat in the garage because of a last-minute hiccup, which meant we couldn’t enter the actual conference hall for 15 minutes, until after the event had started. I’m pleased to report that all my companions from the garage are doing alright. Jimmy was released from my forced company and made a fine opening speech shortly after.

WikiConference2011 was the first national level meeting of the Wikimedia India community and was held between November 17th and 20th, 2011 in Mumbai.  The gorgeous Convocation Hall was the venue for nearly 700 people – making it one of the largest Wikimedia events ever!  I suppose it was to be expected given that this was in India – but the logistical challenges of such a large event were quite the handful – especially for a small community.

Cutely, as with all things Wikimedia, it started off with an innocuous Meta page created by a member of the Mumbai community with some placeholders for a potential Wikimania bid. Much water – and a few other fluids – subsequently flowed and the India community cajoled and organized and galvanized itself to pull off this event.  It involved – as all such events tend to – tireless work over many months by so many volunteers across the country (and indeed the world.)

My personal take on the event was that it was an encouraging start. Notwithstanding all the chaos and conflict and angst in the lead-up to the event (and indeed some which spilled over at and beyond the event) – and the intermittent wi-fi and dodgy food and hip hop music at the party – I was pleased with WikiConference.

This was the first time that the India community has worked together at a national level. The value of this exposure is going to be extremely powerful to build a strong and cohesive community – and the power of this experience to build and improve the quality of projects is going to be invaluable.

The quality of the discourse was learned and informed and measured. More than 50 sessions had insights on community building, improving project quality, establishing partnerships, best practices in outreach and just good old fashioned sharing of experiences.

While there were some really interesting sessions, the hang outs outside were fantastic for folks to get to know each other, put faces to names, share ideas and to run off for a cup of sugar cane juice across the road. I’m sure these connections will grow into stronger collaboration going forward.

Personally, my high point was the penultimate session where recognition was given to notable Wikimedians from across the country. Interestingly, the majority of the folks who got this acknowledgement had not attended – but community members stepped in and each said very touching words about the recipients and gave very human stories about how they had helped on various projects.

Events such as this serve to energise communities – and I’m really inspired to see that a bunch of community members have started doing a bunch of things after Mumbai. These include proposals for specific community development work to participation in outreach sessions or improvement of articles related to India. The real measure of such events is often only apparent after the actual event.

I really hope that this event becomes an annual affair – and that India puts up a Wikimania bid at some point in time. The organizational experience from this conference (and hopefully future events) will prove to be a powerful advantage for such a bid.

I hope – some day –  to see you folks in India to attend a Wikimania.

Hisham Mundol
Consultant, India Programs

Supporting the languages of India

India is different. Given that India is very strategic for the Wikimedia Foundation, the question is what we can do to raise the profile of our projects and what we can do to support the Indic languages effectively.

Many well educated people, people with a university level education, are effectively illiterate in their own language. A Wikipedia in their own language does not tempt them to get involved. They do not have the skills, even though it would not be that hard for them to learn to read and write their mother tongue. Writing the Indic languages is helped in two ways: the scripts are really phonetic, and InScript, the dominant keyboard layout for Indic languages, ensures that the same sound is always in the same place.

When our goal is to get more people involved in the Indic languages, we can ask people to transcribe the scans of public domain books. We will provide them with a keyboard mapping and the fonts that show their language. As these “illiterates” recognise the characters and reproduce them digitally, they learn not only to type their language; they may even learn to read it. When we recognise their effort in a thank you note accompanying the book, experience teaches us they are likely to help us in future projects.

The project that is already making a big impact in India in this way is the Malayalam Wikisource project. They published a CD with a year’s worth of sources and distributed it to the schools of Kerala. They produce software that ensures that the content looks really good. The software as well as the content is available on the internet, but sadly this full experience cannot be had on Wikisource itself.

When a new book becomes available, the Malayalam press often mentions it in their periodicals, so much so that Wikisource is mentioned more often in the press than Wikipedia.

 

 

Similar projects for other Indic languages have been a popular topic at the WikiConference India; it was discussed at least for Sanskrit and Tamil. The discussion was not only about the organisation of such a project but also about internationalising the software that prepares the final product and about using Kiwix for presenting it. When you consider how much literature is available in the Indic languages that is already in the public domain, this is a project that will run and run.

Preparing sources in Wikibooks or Wikisource in a collaborative way makes sense in a Wiki. Once the work is done however publishing the content can be in all kinds of formats. This is important because we do want it to be read as widely as possible because this is how we optimally realise our objectives.

Jimmy was right when he said in his speech that the Indic language communities can learn from each other and do really well. However, these best practices can be applied to any Wikisource or Wikibooks.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant

The Mumbai hackathon was sweet

When a hackathon is organised, it is wonderful when the reality of the results exceeds expectations. The reality was that some of India’s best and brightest attended the hackathon. They represented many of the languages  of India, and it showed.

Seven Indians and a German created an input method for their language. A Russian keyboard method is promised for the next day. There was a jQuery wizard who created a wonderful and necessary addition to the Narayam extension: a visual cue to where the characters are on the keyboard. This information comes directly from the Narayam definitions and the best part is that the visual cue actually works as well.
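To give a feel for what a transliteration input method does, here is a minimal sketch: a table of Latin key sequences mapped to characters, applied with a greedy longest match. This is not how Narayam itself is written or how its definitions are stored; the `RULES` table and the `transliterate` function are hypothetical and cover only a handful of Devanagari characters.

```python
# Hypothetical rule table: Latin key sequences -> Devanagari characters.
RULES = {
    "kh": "\u0916",  # DEVANAGARI LETTER KHA
    "k": "\u0915",   # DEVANAGARI LETTER KA
    "m": "\u092e",   # DEVANAGARI LETTER MA
    "n": "\u0928",   # DEVANAGARI LETTER NA
    "aa": "\u093e",  # DEVANAGARI VOWEL SIGN AA
}
MAX_KEY = max(len(key) for key in RULES)


def transliterate(typed: str) -> str:
    """Replace typed key sequences, always trying the longest rule first."""
    out = []
    i = 0
    while i < len(typed):
        for size in range(MAX_KEY, 0, -1):
            chunk = typed[i:i + size]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            # No rule matched: pass the character through unchanged.
            out.append(typed[i])
            i += 1
    return "".join(out)
```

Trying the longest sequence first is what lets "kh" produce one letter rather than "k" followed by a stray "h". A visual keyboard cue like the one described above could in principle be generated by reading the same table in reverse.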

The WebFonts extension got its reality check. WebFonts provides default fonts in order to ensure that nobody sees the infamous Unicode squares and numbers instead of the desired characters. The MediaWiki software is exclusively open source, and consequently the fonts we deliver through the WebFonts extension need to be freely licensed, too. The default font we use for the Indic languages is the Lohit font produced by Red Hat. It was quite astonishing to learn that some of its glyphs do not look the way the characters should. Bugs have been filed for this at Red Hat and more work will be done.

We are going to roll out the WebFonts extension on December 12th. Our aim is to install it on the Indic projects. When we have freely licensed fonts that show languages correctly, we will finally be able to provide readable content to everyone. We will be working towards resolving the issues identified at the hackathon.

The Mumbai hackathon has also been good for the Kiwix off-line reader; not only was the software localised into several languages, new developers also familiarized themselves with the software itself to implement further improvements. This is quite important because many Indian people have no or intermittent access to the Internet. In addition to Wikipedia content, there are many projects in India to transcribe books that are in the public domain; as the Kiwix software gets ready to support this content, it will help more and more people get access to India’s rich cultural heritage.

Mobile support was the third centre of gravity; many first-time Wikimedia hackers teamed up with seasoned Wikimedia developers and this produced great results. This included work on a mobile landing page for India, as well as a gateway that allows users to receive Wikipedia articles over SMS and the carrier-specific USSD technology. To appreciate this, consider that many people do not have access to the Internet and consequently to our content. Work also continued on the “Wikipedia Zero” project, which aims to bring Wikipedia and other Wikimedia content to millions of users without data charges.

We also saw an interesting connection with the October 2011 Coding Challenge. Developer Yuvipanda implemented Android 2.2 support for one of the coding challenge submissions, the “Share with Wikimedia Commons” Android app (as well as for the official Wikipedia Android app).

All this will get some review, maybe some polishing but we are quite eager to bring this functionality to you.

Many of the hackers were new to MediaWiki. With an introduction by Erik and private tutoring by Sumana, Tomasz, Patrick, and others, several people really got into the swing of things to the extent that some bugs were smashed.  The hackathon proved as always that when you bring great people together special things can and do happen.

Thanks,
Gerard Meijssen
Internationalization / Localization outreach consultant