Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts Tagged ‘India’

The First ever Creative Commons event in Telugu: Ten Telugu Books Re-released Under CC

Event flyer, User:రహ్మానుద్దీన్, CC-BY-SA 3.0

Telugu is one of the 22 scheduled languages of the Republic of India (Bhārat Gaṇarājya) and is the official language of the Indian states of Andhra Pradesh and Telangana and of Yanam, a district of a Union Territory. In India alone, Telugu is spoken by 100 million people, and it is estimated to have 180 million speakers worldwide. The government of India declared Telugu a Classical language in 2008.

Telugu Wikipedia has existed for more than 10 years and has 57,000 articles. Telugu Wikisource, one of its sister projects, has more than 9,400 pages. Several Telugu books are being typed and proofread using the Proofread extension. Because Telugu uses one of the complex Indic scripts, computing support for it arrived relatively late, and many books that were published (or are being published) are not in Unicode. Telugu Wikisource has now emerged as the largest searchable online book repository in Telugu. Telugu Wikisourcerers, despite being a small community, have done a great job of digitizing many prominent Telugu literary works. Attempts have been made to convince contemporary writers to re-release their books under the CC-BY-SA 3.0 license. One such effort, a year ago, brought in a Telugu translation of the Quran. Recently, 10 Telugu books by a single author were re-released under the Creative Commons license (CC-BY-SA 3.0) on June 22, 2014 at The Golden Threshold, an off-campus annex of the University of Hyderabad. CIS-A2K played an instrumental role in getting this content donated. This is one of the first instances in an Indian language where a single author has re-released such a large collection of books under the CC license. These books are being uploaded to Telugu Wikisource using Unicode converters.

(more…)

Wikipedia for Schools Project

Teachers and students in Nyeri, Kenya listening to a tutorial under the Wikipedia for Schools Project.

In 2005, SOS Children, the world’s largest charity for orphaned and abandoned children,[1] [2] launched the “A World of Learning” project, which handpicks Wikipedia articles and categorizes them by subject for schoolchildren around the world to use. The project focuses on content that is suitable for students between the ages of 8 and 17, based on the UK education curriculum. In November 2006, the Wikimedia Foundation (WMF) endorsed the project, which resulted in its relaunch as “Wikipedia for Schools” with a new web address (http://www.schools-wikipedia.org). Since then, the project has continued to enjoy the support of the Wikimedia Foundation. The website was revised in 2008-09 and again in 2013. The 2013 edition has 6,000 articles, 26 million words and 50,000 images, making it a fairly large project that caters to the needs of schoolchildren across the globe. The website also contains a “download the website” link, which lets users download the material for use without an internet connection. [1]

Hole in the Wall Education Ltd (HiWEL) supported the Wikipedia for Schools project in an effort to expand its reach in Learning Stations in India and African countries.[2] The program has received recognition from around the world for its far-reaching impact. According to Subir from Nepal’s online learning project E-Pustakalaya, “Wikipedia for Schools has been really useful in public schools in Nepal. The students of remote corners of Nepal, where there is no internet access, now know about the diverse culture, religion, art, science and lifestyles of the countries around the world. All credit goes to the team that built this wonderful repository of information for schools.” [1] Similarly, Patrick of Treverton Schools, South Africa, welcomed the effort as a “fantastic resource for schools with little or no bandwidth, of which there are many here in South Africa.”

(more…)

Samskrita Bharati and Sanskrit Wikipedia: The journey ahead

“Aksharam,” Samskrita Bharati Office in Bangalore.

In 1981, a movement called the “Speak Samskrit Movement” started in Bangalore. The effort quickly spread across India and evolved into the organization “Samskrita Bharati” in 1995. The movement has a number of dedicated volunteers who aim to popularize the Sanskrit language, Sanskrit culture and the Knowledge Tradition of India.[1]

In line with these objectives, Samskrita Bharati embarked upon a mission to enrich Sanskrit Wikipedia in 2011. This project involved approximately 50 volunteers, some of them working full-time. Most of the contributors are based in Bangalore, Karnataka or Karnavati, Gujarat. As a result of tremendous effort and dedication, the team was able to substantially grow the number of articles on Sanskrit Wikipedia. In 2011 it had only 2,000 articles, mostly written in Hindi; it now has well over 10,000, on topics ranging from geography and history to health and society.

Samskrita Bharati editors, like other Sanskrit Wikipedians, have had difficulty with modern terminology and with the paucity of referenceable literature. Most of the contributors to Sanskrit Wikipedia are from the southern region, which causes confusion over some Sanskrit words that are pronounced differently in the northern and southern regions.

As part of the outreach efforts, Samskrita Bharati conducted introductory workshops in many educational institutions like Karnataka Samskrit University, Delhi University and Christ University, Bangalore.

(more…)

Odisha Dibasa 2014: 14 books re-released under CC license

This post is available in 2 languages: English • Odia

Guests releasing a kit DVD containing the Odia typeface “Odia OT Jagannatha,” the offline input tool “TypeOdia,” Odia language dictionaries, open-source software, offline Odia Wikipedia and an Ubuntu package.

Odisha became a separate state in British India on April 1, 1936. Odia, a 2,500-year-old language, recently gained the status of an Indian classical language. The Odia Wikimedia community celebrated these two occasions on March 29 in Bhubaneswar with a gathering of 70 people. Linguists, scholars and journalists discussed the state of the Odia language in the digital era, initiatives for its development and steps that can be taken to increase accessibility to books and other educational resources. Fourteen copyrighted books were re-licensed under the Creative Commons license, and the digitization project on Odia Wikisource was formally initiated by an indigenous educational institute, the Kalinga Institute of Social Sciences (KISS). Professor Udayanath Sahu from Utkal University, The Odisha Review’s editor Dr. Lenin Mohanty, Odisha Bhaskar’s editor Pradosh Pattnaik, Odia language researcher Subrat Prusty, Dr. Madan Mohan Sahu, Allhadmohini Mohanty, chairman of Manik-Biswanath Smrutinyasa, and the trust’s secretary Brajamohan Patnaik, along with senior members Sarojkanta Choudhury and Shisira Ranjan Dash, spoke at the event.

Group photo of Odia Wikimedians participating in the advanced Wikimedia workshop at KIIT University.

Eleven books by Odia writer Dr. Jagannath Mohanty were re-released under the Creative Commons Attribution-ShareAlike (CC-BY-SA 3.0) license by “Manik-Biswanath Smrutinyasa,” a trust founded by Dr. Mohanty for the development of the Odia language. Allhadmohini Mohanty formally gave written permission to Odia Wikimedia to release and digitize these books.

The community will be training students and a group of six faculty members at KISS who will coordinate the digitization of these books. “Collaborative efforts and open access to knowledge repositories will enrich our language and culture,” said linguist Padmashree Dr. Debiprasanna Pattanayak as he inaugurated the event. Dr. Pattanayak and Odia language researcher Subrat Prusty from the Institute of Odia Studies and Research also re-licensed three books under the same license: two Odia books, “Bhasa O Jatiyata” and “Jati, Jagruti O Pragati,” and an English book, “Classical Odia.” The books are based on their research on the Odia language and its cultural influence on other societies. KISS is going to digitize some of these books and make them available on Odia Wikisource.

(more…)

Celebrating Women’s Day, the Wiki way

Participants editing articles about women in science.

How many Indian women scientists can you name? Go on! Think about this one. Think really hard. How many can you name, now? One? Two? Three?

I wrote this blog post at a co-working space for tech startups in the southern Indian city of Kochi. I was surrounded by science students. None of them could think of a single woman scientist from India. Pretty shameful, isn’t it? And there was nobody to burst our sexist bubble except Wikipedia. This page lists 15 women scientists from India. While I am grateful for this archive, it is hardly comprehensive: 15 women scientists from a country of 1.2 billion people.

India is currently Asia’s third largest economy and it prides itself on making many ancient discoveries. Given this context, it is unbefitting for us to come up with such a tiny list. (By the way, if you know of a more detailed website on this subject, please send me the link on Twitter, which you can find at the bottom of this page.) Could there be women whose contributions to science have slipped out of popular culture?

Wikipedia has organized edit-a-thons for the entire month of March to address these glaring gaps in our knowledge. The goal of these edit-a-thons is to celebrate International Women’s Day, which fell on March 8. During this month, we would like to enhance the quantity and quality of Wikipedia articles on gender and sexuality and translate English articles into Indic languages. Anyone can join the celebrations as an editor, translator, blogger, event manager or enthusiast.

We encourage more South Asian women to use this opportunity: right now 9 out of 10 Wikipedians are men. There are many subjects that may be of interest or value to women that are not covered in traditional encyclopaedias because the majority of knowledge-producers are men. Let us make sure that Wikipedia is diverse and that voices from all sections of society are represented.

We have kick-started the event with weekend edit-a-thons. We will provide specific topics and links to editors to write or expand upon. This month the focus is on women parliamentarians and scientists.

So come on over, put your editing skills to use, make some new friends and, last but not least, learn more about women scientists from India!

- Diksha Madhok, Wikipedian

Odia Wikipedia: Three years of active contributions gives life to a ten year old project

This post is available in 2 languages: English • Odia

Group photo of Odia Wikipedia 10 day celebration at KISS, Bhubaneswar

Odia Wikipedia celebrated its 10th anniversary on January 29, 2014. Odia is spoken by roughly 33 million people in eastern India and is one of the many official languages of India. Odia Wikipedia started as one of the first Indic-language Wikipedias, but in 2011 it had only 550 articles and practically no contributors. The initial Wikipedians struggled to reach out to more people. Luckily, as more people came online, primarily on social media platforms, collaboration became easier. Odia Wikipedia’s Facebook page and group became the social gateway for more people to get used to working in the Odia language. Odia is one of the languages with very little online content available as Unicode text. Many people still struggle with outdated, pirated operating systems installed on their computers, which has added more hurdles for the community-led Wikipedia outreach programs. Recently there have been more developments in language input and online contribution in Odia, and more people have started searching for online content using Odia in Unicode. This is where Odia Wikipedia played a crucial role, promoting a massive growth in content that is reflected in the readership. Monthly page views, which had remained consistently low over the years, grew from fewer than 1,000 to more than 400,000, at times hitting the 500,000 mark. This is the highest among all websites that have Odia content.

With a variety of new projects and more contributors than ever, Odia Wikipedia happily celebrated its 10th anniversary over two days. Odia Wikipedians gathered at two educational institutes: the Kalinga Institute of Social Sciences in Bhubaneswar on January 28 and the Indian Institute of Mass Communication in Dhenkanal on January 29.

Day 1:

Debiprasanna Patnaik introducing himself for the Voice intro project

The first day of Odia Wikipedia 10 began with the traditional Chhena poda cutting by noted linguist Padmashree Dr. Debiprasanna Patnaik. The Kalinga Institute of Social Sciences (KISS) has recently collaborated with The Centre for Internet and Society on resource gathering, documentation and archival work for 62 tribal communities of Odisha and neighboring eastern Indian states, and on initiating Wikipedia projects in the indigenous tribal languages. The first few phases of the workshop brought together about 15 students pursuing master’s degrees in Arts, Science and Commerce disciplines and 10 faculty members.
(more…)

WikiSangamotsavam-2013 brought Wikimedians from all over India together

Mr. Sashi Kumar‘s speech during the inaugural session of WikiSangamotsavam-2013

WikiSangamotsavam, the annual conference of Malayalam Wikimedians, took place in Alappuzha, Kerala, India from December 21 to 23, 2013. The conference, supported by the Wikimedia Foundation Grants program, the Wikimedia India Chapter and the CIS-A2K program, brought together around 200 Wikimedians and well-wishers from all over India.

The host town, Alappuzha, is popularly known as the Venice of the East due to its picturesque backwaters and canals. Alappuzha was chosen as the location for the conference to draw Wikimedians’ attention to the region’s diversity and touristic appeal, and thereby increase the town’s representation on Wikipedia. The Board meeting of the Wikimedia India Chapter took place in conjunction with the event. A range of pre-conference events, including a bicycle rally, a meetup for young Wikimedians and several edit-a-thons, took place in the run-up to the conference.

Day 1

The first day of WikiSangamotsavam started with Wiki-Vidyarthi-Sangamam, a meetup of student Wikimedians. The digitization of ‘Sri-Mahabharatham,’ a seven-volume Malayalam epic, was flagged off during the Wiki-Vidyarthi-Sangamam. Around 100 students from all over Kerala got to interact with each other and learn about Wikimedia projects in Malayalam. There was also a Wikipedia workshop for delegates with disabilities, who were introduced to various means of accessibility by the DAISY Consortium. This session helped them learn about self-education tools, accessing knowledge platforms like Wikipedia and contributing online in Malayalam.

There was a panel discussion on ‘Malayalam and Wikipedia’ during which language and computing experts discussed the role of Wikipedia in the growth of the Malayalam language. Talks and presentations about topics relevant to Wikimedia were held in three parallel sessions.

The first day of the event also marked Malayalam Wikipedia’s 11th birthday. The special occasion was celebrated by cutting a birthday cake. At the end of the day, Wikimedians entertained themselves by singing folk songs of Kerala.

(more…)

Language Engineering Events – Language Summit, Fall 2013

The Wikimedia Language Engineering team, along with Red Hat, organised the Fall edition of the Open Source Language Summit in Pune, India on November 18 and 19, 2013.

Members from the Language Engineering, Mobile, VisualEditor, and Design teams of the Wikimedia Foundation joined participants from Red Hat, Google, Adobe, Microsoft Research, Indic language projects, Open Source Projects (Fedora, Debian) and Wikipedians from various Indian languages. Google Summer of Code interns for Wikimedia Language projects were also present. The 2-day event was organised as work-sessions, focussed on fonts, input tools, content translation and language support on desktop, web and mobile platforms.

Participants at the Open Source Language Summit, Pune, India

The Fontbook project, started during the Language Summit earlier this year, was marked to be extended to 8 more Indian languages. The project aims to create a technical specification for Indic fonts based upon the OpenType v1.6 specification. Pravin Satpute and Sneha Kore of Red Hat presented their work on the next version of the Lohit font family, which is based upon the same specification and uses Harfbuzz-ng. This effort is expected to complement the goals of the Fontbook project.

The other font sessions included a walkthrough of the Autonym font created by Santhosh Thottingal, a Q&A session by Behdad Esfahbod about the state of Indic font rendering through Harfbuzz-ng, and a session to package webfonts for Debian and Fedora for native support. Learn more about the font sessions.

Improving the tools for multilingual input in the VisualEditor was discussed extensively. David Chan walked through the event logger system built for capturing IME input events. The logger, available at http://tinyurl.com/imelog, is being used as an automated IME testing framework to build a library of similar events across IMEs, OSs and languages.

Santhosh Thottingal stepped through several tough use cases of handling multilingual input, to support the VisualEditor’s inherent need to provide non-native support for handling language content blocks within the contentEditable surface. Wikipedians from various Indic languages also provided their input. On-screen keyboards, mobile input methods like LiteratIM and predictive typing methods like ibus-typing-booster (available for Fedora) were also discussed. Read more about the input method sessions.

The Language Coverage Matrix Dashboard (LCMD), which displays the language support status for all languages in Wikimedia projects, was showcased. The Fedora Internationalization team, which currently provides resources for fewer languages than the Wikimedia projects, will identify the gap using the LCMD data and assess the resources that can be leveraged to enhance support on desktops. Dr. Kalika Bali from Microsoft Research Labs presented on leveraging content translation platforms for Indian languages and highlighted that machine translation (MT) for Indic languages could be improved significantly by using web-scale content like Wikipedia.

Learn more about the sessions, accomplishments and next steps for these projects from the Event Report.

Runa Bhattacharjee, Outreach and QA coordinator, Language Engineering, Wikimedia Foundation

First ever Train-the-Trainer Program in India

The Access to Knowledge Programme at the Centre for Internet & Society (CIS-A2K) organized the first ever Train the Trainer Program in India. Twenty Wikimedians from 8 different language communities and 10 different cities across India attended CIS-A2K’s Train the Trainer (TTT). The residency program was spread over four days. The Wikimedia communities represented were Bengali, Gujarati, Sanskrit, Malayalam, Hindi, Marathi, Telugu and Odia. The event was organized to help build capacity among Wikimedia volunteers to conduct effective and efficient outreach programs in their respective regions, in an effort to expand the Wikimedia movement to reach the nooks and crannies of a large nation like India. CIS-A2K realizes that with a small team of five it cannot cover all communities. The program aims to create leadership, which in turn will hopefully take the movement forward.

Hari Prasad Nadig, one of TTT’s resource persons and sysop on both Kannada and Sanskrit Wikipedia, said, “I think the training program was in the right direction. In fact I thought it was a very good idea. It falls in-line with what is needed to be done with utmost importance for the Indian Wikipedia community – creating more trainers/mentors who can bring in editors to Wikipedia or guide the existing ones.”

Post-Event Survey & Report

CIS-A2K conducted a post-event survey to evaluate the TTT program and to review the individual training and development activities organized during the four-day workshop. The main aim of the survey was to understand how the attendees perceived the event and to help CIS-A2K plan a more successful and well-attended event in the future.

The survey used a variety of question types, including Likert-scale questions, drag-and-drop lists, paragraph text and multiple choice, which provided an interactive and systematic way to gather participants’ feedback. The questions were also designed to cover different aspects of the event, including attendees’ opinions of the sessions and what they learned. Results and findings will be used to refine and improve the next TTT program.

(more…)

Report from the Spring 2013 Open Source Language Summit

Fortuna i forti aiuta, e i timidi rifiuta (“Fortune helps the strong and rejects the timid”), an Italian proverb

The Wikimedia Foundation and Red Hat jointly organized the Second Open Source Language Summit on February 12th and 13th, 2013. The summit was held at the Red Hat engineering center in Pune, India. Similar to the previous summit, this face-to-face work session was focused on internationalization (i18n) and localization (l10n) features, font support, input method tools, language search, i18n testing methods and standards. The sessions were work sprints, each with special focus on a key area. Participants included core contributors from the Wikimedia Foundation, Red Hat (including Fedora SIG members), KDE, FUEL, Google and C-DAC. Below is a summary of what was accomplished during these two days.

During the summit, teams from different organizations came together to discuss language-related challenges, and worked together on features and tools to address them.

Input Methods

Parag Nemade and Santhosh Thottingal worked on making additional input methods available for the jQuery.IME library. Sixty input methods, covering languages such as Assamese, Esperanto, Russian, Greek and Hebrew, were added, bringing the total to 144. IMEs from the m17n library that are missing from the jQuery.IME library were also identified.

Translation tools, translatewiki.net & FUEL Sprint

Siebrand Mazeland and Niklas Laxström, together with Ankit Patel, Rajesh Ranjan and Red Hat language maintainers, worked to identify more tools that could be used as translation aids in a translation system. The FUEL project aims to standardize translations for frequently used terms, translation style and assessment methodology. Until now it has focused mostly on languages of India. The FUEL project can now be translated on translatewiki.net. Pau Giner demonstrated new designs for the translation editor and terminology usage, remotely from Spain.

Language Coverage Matrix

To better evaluate the needs for enabling support for languages, a matrix detailing the requirements and availability of basic and extended features is being drawn up. With 285 languages currently supported in Wikimedia and more than 100 in Fedora, this document will be instrumental in bridging the gaps and porting features across projects and platforms. Key areas of evaluation include input methods, fonts, translation aids like glossaries and spell-checkers, testing and validation methods, etc. A preliminary draft was created during the summit by Alolita Sharma, Runa Bhattacharjee and Amir E. Aharoni.

Fonts, WebFonts

An initiative to document the technical aspects of fonts for the scripts of languages spoken in India started during the language summit. For each script, a reference font will be chosen and described in detail against the OpenType font specification as the standard. The document aims to act as a reference for any typographer working on Indian language fonts. An initial draft and outline were prepared during the second day of the language summit, mainly by Santhosh Thottingal and Pravin Satpute.

Testing Internationalization Tools

Finding suitable methods for testing internationalized components and content was the major focus of this sprint, with the Fedora Localization Testing Group (FLTG) and Wikimedia’s Language Engineering team sharing details of their testing methods. The FLTG conducts Test Days prior to Fedora beta releases with a test matrix targeted at specific core components, and Wikimedia uses unit tests for frequent testing of its features under development. The FLTG showed its plans to integrate the screenshot comparison method for testing localized interfaces. This method will be useful for Wikimedia too. Extending the method to web-based applications and to Wikimedia’s language requirements (e.g. right-to-left) was identified as an area for collaboration.
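To give a rough idea of what a screenshot comparison test can look like, here is a minimal Python sketch. It is not the FLTG’s tooling, whose details are not described in this report. It assumes screenshots have already been decoded into grids of RGB pixels, and the names `diff_ratio`, `layout_matches`, `tolerance` and `max_ratio` are hypothetical, chosen only for this illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Image = List[List[Pixel]]      # rows of pixels, assumed already decoded


def diff_ratio(reference: Image, candidate: Image, tolerance: int = 0) -> float:
    """Return the fraction of pixels that differ between two screenshots.

    A pixel counts as different when any colour channel differs by more
    than `tolerance`. Both images must have identical dimensions.
    """
    if len(reference) != len(candidate) or any(
        len(r) != len(c) for r, c in zip(reference, candidate)
    ):
        raise ValueError("screenshots must have identical dimensions")

    total = sum(len(row) for row in reference)
    if total == 0:
        return 0.0

    changed = 0
    for ref_row, cand_row in zip(reference, candidate):
        for ref_px, cand_px in zip(ref_row, cand_row):
            if any(abs(a - b) > tolerance for a, b in zip(ref_px, cand_px)):
                changed += 1
    return changed / total


def layout_matches(reference: Image, candidate: Image,
                   max_ratio: float = 0.01, tolerance: int = 0) -> bool:
    """Hypothetical pass/fail rule: allow at most `max_ratio` changed pixels."""
    return diff_ratio(reference, candidate, tolerance) <= max_ratio


if __name__ == "__main__":
    white, black = (255, 255, 255), (0, 0, 0)
    reference = [[white, white], [white, white]]
    localized = [[white, black], [white, white]]  # one pixel changed
    print(diff_ratio(reference, localized))
    print(layout_matches(reference, localized, max_ratio=0.1))
```

In practice a localized interface is expected to differ from the reference wherever text is translated, so a real test would probably compare only selected regions or use a looser threshold; the threshold here is an arbitrary placeholder.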

More news from the Language Summit can be found in the tweets, the session notes and the full report.

Runa Bhattacharjee, Outreach and QA coordinator, Language Engineering