Wikimedia blog

News from inside the Wikimedia Foundation.org

Posts Tagged ‘open-source’

GLAMCampNYC: help us make mass uploads easier

Today, several Wikimedians and representatives from galleries, libraries, archives and museums (GLAM institutions) met in New York City to kick off GLAMCampNYC.  New York City’s public Science, Industry, and Business Library is hosting the event.

Liam Wyatt, the Wikimedia Foundation’s Cultural Partnerships Fellow (aka GLAM fellow), introduced two keynoters: Meg Bellinger, discussing open access at Yale, and Maarten Zeinstra, presenting the Europeana public domain calculator.  The conference continues through Sunday.  Participants are discussing and building the GLAM outreach wiki, writing documentation, sharing best practices, and building tools.

Developers at GLAMCamp are building a data-munging tool, based on pywikipediabot, to aid in mass uploads (more details).  According to Wyatt, the most common requests from GLAM institutions are (1) mass upload of audiovisual media and (2) metrics, “easily exportable statistics based on analytics on a GLAM’s relationship with Wikimedia.”  The data-munging, or data ingestion, tool will help import metadata from large sets of files, speeding up the difficult part of mass uploads.  Attendees will be hacking on it in sprints this weekend, starting 3pm-4:30pm UTC tomorrow, Saturday the 21st. Join them in person (11am local time), or in #glamwiki on Freenode.
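To give a feel for what "data munging" means here, the following is a minimal Python sketch, not the tool being built at GLAMCamp. It assumes an institution exports its catalogue as CSV and that each row should become a Commons {{Information}} template block. The column names, the FIELD_MAP mapping, and the sample record are all hypothetical, and no pywikipediabot calls are shown.

```python
import csv
import io

# Hypothetical mapping from an institution's column names to the fields of
# the Commons {{Information}} template. Real exports will differ.
FIELD_MAP = {
    "title": "description",
    "created": "date",
    "collection_url": "source",
    "creator": "author",
}


def row_to_wikitext(row):
    """Render one metadata row as an {{Information}} template block."""
    lines = ["{{Information"]
    for column, field in FIELD_MAP.items():
        value = (row.get(column) or "").strip()
        lines.append(f"|{field}={value}")
    lines.append("}}")
    return "\n".join(lines)


def munge(csv_text):
    """Return {filename: wikitext} for every row in the CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["filename"]: row_to_wikitext(row) for row in reader}


# Made-up sample record, for illustration only.
SAMPLE = """filename,title,created,collection_url,creator
harbor.jpg,View of the harbor,1905,http://example.org/items/1,Unknown
"""

pages = munge(SAMPLE)
print(pages["harbor.jpg"])
```

Keeping the column mapping in one table means a new institution's export can be supported by editing FIELD_MAP rather than the rendering code; the upload step itself would be handled separately.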

See notes from today’s general talks and discussion and from the discussion of the GLAM Ambassadors program, or follow #glamwiki and #glamcamp on Twitter and Identi.ca.

-Sumana Harihareswara
Volunteer Development Coordinator, Wikimedia Foundation

MediaWiki selects eight students for Google Summer of Code 2011

We received more than 25 proposals for this year’s Google Summer of Code, and several mentors put many hours into evaluating project ideas, discussing them with applicants, and making the tough decisions.  Our final choices for MediaWiki’s Google Summer of Code students for 2011 are:

  • Akshay Agarwal’s “Account Creation, Login Screens and AJAX-ification of everything” (mentor: Brandon Harris)
  • Kevin Brown’s “Working Archival for Web References/Citations,” “to facilitate the archival of external links used as references in the English Wikipedia” (mentor: Neil Kandalgaonkar)
  • Devayon Das’s “Improving Semantic Search/Semantic Query usability issues in SMW” (mentor: Markus Krötzsch)
  • Ankit Garg’s “Semantic Schemas extension” (mentor: Yaron Koren)
  • Salvatore Ingala’s “AMICUS: Awesome Monolithic Infrastructure for Customization of User Scripts” (mentors: Brion Vibber and Max Semenik)
  • Aigerim Karabekova’s “Extension Release Management” (mentors: Sam Reed, Priyanka Dhanda, and Chad Horohoe)
  • Yuvi Panda’s “Making Offline Wikipedia Article Selection Easier with Mediawiki Extensions” (mentor: Arthur Richards)
  • Zhenya Vlasyenko’s “MediaWiki Extension: SocialProfile – UserStatus feature” (mentor: Jack Phoenix)

You’ll be hearing more about each of these projects in the next few weeks!

Congratulations to this year’s students, and thanks to all the applicants, as well as MediaWiki’s many mentors, developers who evaluated applications, and Google’s Open Source Programs Office.  The accepted students now have a month to ramp up on MediaWiki’s processes and get to know their mentors (the Community Bonding Period) and will start coding their summer projects on or before May 23rd.  As organizational administrator for MediaWiki’s GSoC participation, I’ll be keeping an eye on all eight students and helping them out.

Good luck!

Project ideas, students, and mentors wanted for Google Summer of Code

For the sixth year in a row, Wikimedia is participating in the Google Summer of Code program. Google Summer of Code (GSoC) is a program in which Google pays students USD 5000 each to hack on open source projects over the summer (read more).

Over time, MediaWiki has benefited from GSoC students and their projects. For example, Samuel Lampa’s 2010 RDF import/export extension in Semantic MediaWiki is in use. And Jeroen De Dauw, GSoC student in 2009 and 2010, is now a persistently contributing member of the MediaWiki community, as is Brian Wolff, 2010 GSoC student.

In the past, the administrative and management challenges of GSoC have been an extra task that took engineers’ time and too often fell through the cracks. So this year, Rob Lanphier asked me to act as organizational administrator for MediaWiki’s involvement, via the Wikimedia Foundation.

I’m recruiting students to apply, getting project ideas, and managing the application process overall. Once we choose the students and they start ramping up and working, I will also help mentors manage their students and keep communication going, to make sure that every GSoC student’s project gets delivered and gets used!

We hope 2011’s students will develop useful chunks of MediaWiki (core, extensions, gadgets, scripts, or utilities), help us get their code shipped, and stay in the MediaWiki community afterwards.

This year’s ideas include writing and implementing cite templates in a PHP extension, improving the ImageTagging extension, XML dump work, pre-commit checks in our code repositories, and more. And of course we want to hear your own ideas, too! Interested?

University, community college, and graduate students around the world are eligible to apply to Google Summer of Code. You don’t need to be a computer science or IT major, and you can work from home.

We are looking for students who already know PHP. It’s also great if you have some experience with LAMP, MAMP, LAPP, or one of those kinds of stacks, and with the Subversion version control system. If you haven’t contributed to MediaWiki before, How to become a MediaWiki hacker is a good place to start.

If you’d like to participate, check out the timeline. Make sure you are available full-time from 23 May till 22 August this summer, and have a little free time from 25 April till 23 May for ramp-up.

If you’re interested, please sign up on our wiki page and start talking with us on IRC in #mediawiki on Freenode about a possible project! Then you can submit your proposal via the official GSoC website. The deadline for you to submit a project proposal is April 8th, but we encourage you to start early and talk with us about your idea first.

And, to repeat what Brion once said:

If you’re an experienced MediaWiki developer and would like to help out with selecting and mentoring student projects, please give us a shout! We’ll take you even if you live in the southern hemisphere. ;) We need folks who’ll be available online fairly regularly over the summer and are knowledgeable about MediaWiki — not necessarily knowing every piece of it, but knowing where to look so you can help the students help themselves.

We’re looking forward to hacking with you!

Sumana Harihareswara
MediaWiki Coordinator, GSoC 2011

Sue Gardner joins Ada Initiative advisory board

Today the Ada Initiative announced the appointment of Sue Gardner, Executive Director of the Wikimedia Foundation, to its first advisory board. The Ada Initiative launched just a few weeks ago and aims to promote the visibility and participation of women in open-source culture. The group, founded by Valerie Aurora and Mary Gardiner, will undertake unique research in the field of women in open-source culture, provide consultative services to organizations and businesses, and develop training and education services.

The Initiative’s namesake, Countess Ada Lovelace (10 December 1815 – 27 November 1852), is considered one of the world’s first computer programmers, and was almost certainly the first woman in computer programming. She collaborated with Charles Babbage, the creator of one of the first mechanical computers, the analytical engine, writing what is generally considered the first code instructions for a computer.

From Wikipedia,

She was the only legitimate child of the poet Lord Byron (with Anne Isabella Milbanke), but had no relationship with her father, who died when she was nine. As a young adult she took an interest in mathematics, and in particular Babbage’s work on the analytical engine. Between 1842 and 1843 she translated an article by Italian mathematician Luigi Menabrea on the engine, which she supplemented with a set of notes of her own. These notes contain what is considered the first computer program—that is, an algorithm encoded for processing by a machine. Though Babbage’s engine was never built, Lovelace’s notes are important in the early history of computers. She also foresaw the capability of computers to go beyond mere calculating or number-crunching while others, including Babbage himself, focused only on these capabilities. [1]

Wikipedia has been in the news recently following a New York Times story highlighting the lack of women participating in the project, based on research gathered by the United Nations University study.  Interest in the topic has brought new thinkers to the Wikimedia community and recently led to the creation of a Wikimedia gender gap mailing list, which is open to the public.

Congratulations, Sue, and good luck to everyone involved in the Ada Initiative!

Jay Walsh, Communications

[1] Ada Lovelace. (2011, February 24). In Wikipedia, The Free Encyclopedia. Retrieved 18:03, February 24, 2011, from http://en.wikipedia.org/w/index.php?title=Ada_Lovelace&oldid=415671634

WikiBhasha

Folks over at Microsoft Research have been thinking about ways to improve content translation between instances of Wikipedia.  For example, today the largest collection of articles is at English Wikipedia (more than 3,000,000).  Compare that number with the collection at Hindi Wikipedia (which as of July 31 of this year had 55,716).  One proven way to increase the number of articles in Hindi is machine translation, but such translations still need human review and often subtle editing to make them elegantly readable.

Enter WikiBhasha, formerly known as WikiBABEL, which launches today as both a MediaWiki extension project and a bookmarklet.  WikiBhasha takes content from a targeted Wikipedia page and displays a machine translation into a second language side-by-side.  Users can edit, add to or delete the translated content, preview their work and then submit it to the second-language Wikipedia.

What’s especially interesting to me about this project is the fact that its author, researcher A. Kumaran, has tirelessly persuaded Microsoft to allow him to open source the client.  The code has been checked into the MediaWiki code tree under the Apache License 2.0, which means that the powerful side-by-side editing tools developed by Mr. Kumaran can potentially be used in other MediaWiki projects.  I’m very pleased to see Microsoft take this step, and I hope you will join me in welcoming WikiBhasha.

Danese Cooper, Chief Technical Officer

Video Labs: Universal Subtitles on Commons

The Universal Subtitles synchronisation interface gives subtitle authors fine-grained control over subtitle timing.

For the past 6 months, the Participatory Culture Foundation has been hard at work on its latest open web video mission: making captioning, subtitling, and translating video publicly accessible in a way that’s free and open. Part of the Mozilla Drumbeat campaign for a better web, Universal Subtitles is a tool and platform to help bring an open solution to subtitling web video. Commons has supported timed text via the mwEmbed gadget for some time, but until today it has been very difficult to create the initial subtitle track. I have been watching the development of the Universal Subtitles effort, and while at the subtitle summit and Open Video Conference we were finally able to hack on bringing the Universal Subtitles widget to Wikimedia Commons.

Today, I am happy to share our first pass at integrating our open subtitle efforts. Please keep in mind that this integration is still at a very early stage of development, but the basic milestone of being able to use the tool on Commons to create and sync up subtitle tracks is an important first step. Even without helpful tools in place, the Wikimedia community has been creating subtitles and translations. We hope this new subtitle editing tool will broaden the number of participants and enable the Wikimedia community to set a new standard for high quality multilingual accessibility in online video content.

If you have a moment, feel free to check out the widget and provide some feedback. If you are looking for a video to subtitle, check out the recently created needs subtitling category.

Michael Dale, Open Source Video Collaboration Technology

Video Labs: P2P Next Community CDN for Video Distribution

As Wikimedia and the community embark on campaigns and programs to increase video contribution and usage on the site, we are starting to see video usage on Wikimedia sites grow, and we hope it will grow a great deal more. One potential problem with increased video usage on the Wikimedia sites is that video is many times more costly to distribute than the text and images that make up Wikipedia articles today. Eventually, bandwidth costs could strain the Foundation’s budget or leave fewer resources for other projects and programs. For this reason it is important to start exploring and experimenting with future content distribution platforms and partnerships.

The P2P-Next consortium is an EU-funded project exploring the future of Internet video distribution. It aims to dramatically reduce the costs of video distribution through community CDNs and P2P technology. The team recently presented at Wikimania 2010 in Gdansk, and today I am happy to invite the Wikimedia community to try out their latest experimental efforts to greatly reduce video distribution costs. Swarmplayer V2.0 is being released today for Firefox (an Internet Explorer plugin is in testing). The Swarmplayer enables visitors to easily share their upload bandwidth to help distribute video. The add-on works with the Kaltura HTML5 library (aka mwEmbed) and url2torrent.net to enable visitors to help offset the distribution costs of any Ogg Theora video embedded in any web page.

Swarmplayer next design overview, learn more on swarmplayer.p2p-next.org

We have enabled this for Wikimedia video via the multimedia beta. Once you have installed the add-on, any video you view on Wikimedia sites with the multimedia beta enabled will be transparently streamed via BitTorrent. The add-on includes simple tools to configure how much bandwidth you use to upload. Even if you upload nothing, using the add-on helps distribute load by playing the video from the P2P network and the local cache on subsequent views. The Swarmplayer has clever performance tuning which downloads high priority pieces over HTTP while getting low priority bits of the video from the BitTorrent swarm. This ensures a smooth playback experience while maximizing use of the P2P network. You can learn more about the technology on the Swarmplayer add-on site.
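The split between HTTP and the swarm can be pictured with a small Python sketch. This is only an illustration of the idea, not the Swarmplayer's actual algorithm, which this post does not describe. It assumes that "high priority" means pieces within a fixed window ahead of the playhead; the function name plan_sources and the urgent_window parameter are made up for the example.

```python
def plan_sources(num_pieces, playhead, urgent_window, have=frozenset()):
    """Assign each missing piece to "http" or "swarm".

    Assumption for illustration: pieces within `urgent_window` of the
    playhead are high priority and go over HTTP; later pieces are left
    to the BitTorrent swarm.
    """
    plan = {}
    for piece in range(playhead, num_pieces):
        if piece in have:
            continue  # already in the local cache, nothing to fetch
        urgent = piece < playhead + urgent_window
        plan[piece] = "http" if urgent else "swarm"
    return plan


# A 10-piece video, playing from piece 2, with piece 3 already cached.
plan = plan_sources(num_pieces=10, playhead=2, urgent_window=3, have={3})
print(plan)
```

Under this assumption, only the pieces needed in the next few moments cost the server bandwidth, and everything further ahead can arrive from peers whenever it is available.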

The P2P Next Team from Delft University of Technology will be presenting the P2P-Next project at the Open Video Conference on October 2nd.

Michael Dale, Open Source Video Collaboration Technology

Video Labs: Kaltura HTML5 Sequencer available on Wikimedia Commons

Screenshot showing a search for cats and an image being dragged into the sequence

I am happy to invite the Wikimedia community to try out the latest Kaltura HTML5 video sequencer as part of a Wikimedia/Kaltura Video Labs project that can now be used on Wikimedia Commons with resulting sequences visible on any Wikimedia project. For those who have been following the efforts, it has been a long road to deliver this sequence editing experience within the open web platform and within the MediaWiki platform. This blog post will highlight the foundational technologies in use by the sequencer in its present state and outline some of the upcoming features in Firefox 4 and enhancements to the sequencer itself that are set to improve the editing experience.

If you want to just jump into editing, please check out the Commons documentation page, play around with the editor, and let us know what you think. This project is at an early stage of development. Your bug reports, ideas, feedback and participation will help drive future features and how these tools are used within Wikimedia projects.

If you’re interested in video on Wikipedia in general, please consider joining the Wikivideo mailing list, which will cover a wide range of topics, including the sequencer, collaborative subtitles, timed text, video uploading, video distribution, format guidelines, and campaigns to increase video contributions to the site.

And finally, if you are in the New York area, consider checking out the Open Video Conference coming up October 1st to the 3rd, which will be a great space to hack on open video and work on ideas for the future of video on Wikimedia projects.

The power of translators

Wikimedia projects support over 270 languages. This amazing global reach is powered by volunteers who translate not only the contents but also the text used in MediaWiki, so that localized wikis can be easily navigated and operated by users in their local language. Translatewiki.net is the translation engine that supports not only Wikimedia projects but also other open source projects. Siebrand and Nike are leading this translation platform.

The user experience programs at the Wikimedia Foundation have also benefited from translatewiki.net and translation volunteers. The usability beta has been completely translated into thirteen languages, and twelve more languages are 99% complete. These stats can be found at the translation completion status page for the usability extension, courtesy of GerardM.

The usability beta is scheduled to become the default interface in April. An additional translation push for languages that are not yet fully translated will greatly improve the usability of the new interface.
GerardM posted a great example on his blog of the interface in Nepali, whose localization is not complete.

Translation help for languages such as Indonesian, Greek, Thai, Arabic, Hebrew, Italian, Sinhala, Korean, and many more is greatly appreciated.

Wikimedia donates servers to deserving non-profits

Every year, Wikipedia usage goes up, and every year the technical folks working and volunteering with Wikimedia have to plan, purchase, and implement new servers to keep up with the growing popularity of Wikipedia and its sister projects.  Thanks to advances in computing, 9 new application servers this year took over the load of 36 application servers from 3 years ago, roughly a fourfold gain per machine.

So when we upgrade, what happens to the old equipment that is too slow for Wikipedia, but not too slow for MANY other non-profits?  We donate it!  These systems were 1U rackmount servers with dual single-core CPUs (2.5-3 GHz), 2-4GB of RAM, and 2-4 HDD bays holding 1-2 80-250GB HDDs. This year, three non-profits received our older systems (in alphabetical order): Drupal.org, OpenStreetMap Foundation, and Sugar Labs.

Drupal.org

Drupal is a free software package that allows an individual or a community of users to easily publish, manage and organize a wide variety of content on a website. Tens of thousands of people and organizations are using Drupal to power scores of different web sites.

OpenStreetMap Foundation

The OpenStreetMap Foundation is an international non-profit organisation supporting but not controlling the OpenStreetMap project. It is dedicated to encouraging the growth, development and distribution of free geospatial data and to providing geospatial data for anybody to use and share.

OpenStreetMap is an open initiative to create and provide free geographic data such as street maps to anyone who wants them.

Sugar Labs

The mission of Sugar Labs® is to produce, distribute, and support the use of the Sugar learning platform; it is a support base and gathering place for the community of educators and developers to create, extend, teach, and learn with the Sugar learning platform.

We hope the recipients of our servers will be able to put them to good use!

Below are some common questions involving Wikimedia and the server donation process:

Q. How can I get some of the decommissioned donation servers?

A. The best place to follow the goings on of our technical team is here, on the Wikimedia Technical Blog.  When we have a batch of servers up for decommissioning and donation, we will announce it on the tech blog, along with instructions on how to apply to receive some servers.

Q. Who is eligible to apply for servers?

A. We try to donate servers only to other non-profits whose core values are similar to or in support of our own.  This means we do not donate them for individual use.  Since these servers were purchased with donations to support Wikimedia, we feel we need to further donate them to other like-minded organizations, since that is how the money for the servers was meant to be spent.

Q. How often does this happen?

A. Most servers are kept in use by Wikimedia beyond three years.  Many of the servers that we have turned off in this batch are anywhere from 3 to 5 years old.  We only replace them when it makes sense from a technical standpoint to do so.  This means we cannot just say ‘we will do this every X months.’  We try to get the most use out of every server, as they were donated or purchased with donations.  So there is no set date. Just keep checking the Wikimedia Technical Blog; when we have more to donate, we will say so there!

Q. I am a student/person/so and so, and I want to learn to develop and do such and such.  Can you send me a server?

A. Sorry, it is just not realistic or fair of us to try to sort out which personal use requests for servers are legitimate and which are folks wanting computers for any other reason.  We choose to limit our donations to other like-minded non-profit organizations.

Rob Halsell
Systems Administrator