Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Posts Tagged ‘open-source’

Video Labs: Universal Subtitles on Commons

The Universal Subtitles synchronisation interface gives subtitle authors fine-grained control over subtitle timing.

For the past 6 months the Participatory Culture Foundation has been hard at work on their latest open web video mission: to make captioning, subtitling, and translating video publicly accessible in a way that’s free and open. Part of the Mozilla Drumbeat campaign for a better web, Universal Subtitles is a tool and platform to help bring an open solution to subtitling web video. Commons has supported timed text via the mwEmbed gadget for some time, but up until today it has been very difficult to create the initial subtitle track. I have been watching the development of the Universal Subtitles efforts, and while at the subtitle summit and Open Video Conference we were finally able to hack on bringing the Universal Subtitles widget to Wikimedia Commons.

Today, I am happy to share our first pass at integrating our open subtitle efforts. Please keep in mind that this integration is still very early in development, but the basic milestone of being able to use the tool on Commons to create and sync up subtitle tracks is an important first step. Even without helpful tools in place, the Wikimedia community has been creating subtitles and translations. We hope these new subtitle editing tools will broaden the number of participants and enable the Wikimedia community to set a new standard for high-quality multilingual accessibility in online video content.

If you have a moment, feel free to check out the widget and provide some feedback. If you are looking for a video to subtitle, check out the recently created needs subtitling category.

Michael Dale, Open Source Video Collaboration Technology

Video Labs: P2P Next Community CDN for Video Distribution

As Wikimedia and the community embark on campaigns and programs to increase video contribution and usage on the site, we are starting to see video usage on Wikimedia sites grow, and we hope for it to grow a great deal more. One potential problem with increased video usage on the Wikimedia sites is that video is many times more costly to distribute than the text and images that make up Wikipedia articles today. Eventually bandwidth costs could saturate the Foundation's budget or leave fewer resources for other projects and programs. For this reason it is important to start exploring and experimenting with future content distribution platforms and partnerships.

The P2P-Next consortium is an EU-funded project exploring the future of Internet video distribution. Their aim is to dramatically reduce the costs of video distribution through community CDNs and P2P technology. They recently presented at Wikimania 2010 in Gdańsk, and today I am happy to invite the Wikimedia community to try out their latest experimental efforts to greatly reduce video distribution costs. Swarmplayer V2.0 is being released today for Firefox (an Internet Explorer plugin is in testing). The Swarmplayer enables visitors to easily share their upload bandwidth to help distribute video. The add-on works with the Kaltura HTML5 library (aka mwEmbed) to enable visitors to help offset the distribution costs of any Ogg Theora video embedded in any web page.

Swarmplayer next design overview, learn more on

We have enabled this for Wikimedia video via the multimedia beta. Once you have installed the add-on, any video you view on Wikimedia sites with the multimedia beta enabled will be transparently streamed via BitTorrent. The add-on includes simple tools to configure how much bandwidth you use to upload. Even if you upload nothing, using the add-on helps distribute load by playing the video from the P2P network and from the local cache on subsequent views. The Swarmplayer has clever performance tuning which downloads high-priority pieces over HTTP while getting low-priority pieces of the video from the BitTorrent swarm. This ensures a smooth playback experience while maximizing use of the P2P network. You can learn more about the technology on the Swarmplayer add-on site.
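To make the prioritization idea concrete, here is a minimal Python sketch. It is not the Swarmplayer's actual code: the function names, the three-piece "urgent window", and the "cache" / "http" / "swarm" labels are all assumptions made for illustration. The only idea taken from the description above is that pieces needed soon come over HTTP, later pieces come from the swarm, and previously viewed pieces come from the local cache.

```python
def choose_source(piece_index, playhead_index, cached, urgent_window=3):
    """Pick where a single video piece should be fetched from.

    urgent_window is a hypothetical tuning knob: how many pieces
    ahead of the playhead count as high priority.
    """
    if piece_index in cached:
        return "cache"   # already on disk from an earlier view
    if playhead_index <= piece_index < playhead_index + urgent_window:
        return "http"    # needed soon, so use the reliable HTTP source
    return "swarm"       # not urgent, so let the BitTorrent swarm supply it


def plan_downloads(total_pieces, playhead_index, cached, urgent_window=3):
    """Map every piece from the playhead onward to a source."""
    return {
        index: choose_source(index, playhead_index, cached, urgent_window)
        for index in range(playhead_index, total_pieces)
    }


if __name__ == "__main__":
    # 8 pieces, playback is at piece 2, pieces 3 and 6 are already cached.
    for index, source in plan_downloads(8, 2, {3, 6}).items():
        print(index, source)
```

A real player would re-run this kind of plan as the playhead moves, so a piece that starts out as "swarm" gets promoted to "http" if the swarm has not delivered it by the time it becomes urgent.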

The P2P-Next team from Delft University of Technology will be presenting the P2P-Next project at the Open Video Conference on October 2nd.

Michael Dale, Open Source Video Collaboration Technology

Video Labs: Kaltura HTML5 Sequencer available on Wikimedia Commons

Screenshot showing a search for cats and an image being dragged into the sequence

I am happy to invite the Wikimedia community to try out the latest Kaltura HTML5 video sequencer as part of a Wikimedia/Kaltura Video Labs project. It can now be used on Wikimedia Commons, with the resulting sequences visible on any Wikimedia project. For those who have been following the efforts, it has been a long road to deliver this sequence editing experience within the open web platform and within the MediaWiki platform. This blog post will highlight the foundational technologies in use by the sequencer in its present state and outline some of the upcoming features in Firefox 4, as well as enhancements to the sequencer itself that are set to improve the editing experience.

If you want to just jump into editing, please check out the Commons documentation page, play around with the editor, and let us know what you think. This project is early in its development. Your bug reports, ideas, feedback and participation will help drive future features and how these tools are used within Wikimedia projects.

If you’re interested in video on Wikipedia in general, please consider joining the Wikivideo mailing list, which will cover a wide range of topics, including the sequencer, collaborative subtitles, timed text, video uploading, video distribution, format guidelines, and campaigns to increase video contributions to the site.

And finally, if you are in the New York area, consider checking out the Open Video Conference coming up October 1st to the 3rd, which will be a great space to hack on open video and work on ideas for the future of video on Wikimedia projects.


The power of translators

Wikimedia projects support over 270 languages. This amazing global reach is powered by volunteers who translate not only the content but also the text used in MediaWiki, so that localized wikis can be easily navigated and operated by users in their local language. is the amazing translation engine which supports not only Wikimedia projects but also other open source projects. Siebrand and Nike are leading this translation platform.

The user experience programs at the Wikimedia Foundation have also benefited from and translation volunteers. The usability beta has been completely translated into thirteen languages, and twelve languages are 99% complete. These stats can be found on the translation completion status page for the usability extension, courtesy of GerardM.

The usability beta is scheduled to become the default interface in April. An additional translation push for languages that are not yet fully translated will greatly improve the usability of the new interface.
On his blog, GerardM posted a great example of the interface in Nepali, whose localization is not complete.

Translation help for languages such as Indonesian, Greek, Thai, Arabic, Hebrew, Italian, Sinhala, Korean, and many more is greatly appreciated.

Wikimedia donates servers to deserving non-profits.

Every year, Wikipedia usage goes upward, and every year the technical folks working and volunteering with Wikimedia have to plan, purchase, and implement new servers to keep up with the growing popularity of Wikipedia and its sister projects. Thanks to advances in computing, 9 new application servers this year took over the load of 36 application servers from 3 years ago.

So when we upgrade, what happens to the old equipment that is too slow for Wikipedia, but not too slow for MANY other non-profits? We donate it! These systems were 1U rackmount servers with dual single-core CPUs (2.5-3 GHz), 2-4 GB of RAM, and 2-4 HDD bays holding 1-2 HDDs of 80-250 GB each. This year, three non-profits received our older systems (in alphabetical order):, OpenStreetMap Foundation, and Sugar Labs.

Drupal is a free software package that allows an individual or a community of users to easily publish, manage and organize a wide variety of content on a website. Tens of thousands of people and organizations are using Drupal to power scores of different web sites.

OpenStreetMap Foundation

The OpenStreetMap Foundation is an international non-profit organisation supporting but not controlling the project. It is dedicated to encouraging the growth, development and distribution of free geospatial data and to providing geospatial data for anybody to use and share.

OpenStreetMap is an open initiative to create and provide free geographic data such as street maps to anyone who wants them.

Sugar Labs

The mission of Sugar Labs® is to produce, distribute, and support the use of the Sugar learning platform; it is a support base and gathering place for the community of educators and developers to create, extend, teach, and learn with the Sugar learning platform.

We hope the recipients of our servers will be able to put them to good use!

Below are some common questions involving Wikimedia and the server donation process:

Q. How can I get some of the decommissioned donation servers?

A. The best place to follow the goings-on of our technical team is here, on the Wikimedia Technical Blog. When we have a batch of servers up for decommissioning and donation, we will announce it on the tech blog, along with instructions on how to apply to receive some servers.

Q. Who is eligible to apply for servers?

A. We try to donate servers only to other non-profits whose core values are similar to or in support of our own. This means we do not donate them for individual use. Since these servers were purchased with donations to support Wikimedia, we feel we need to pass them on to other like-minded organizations, since that is how the money for the servers was meant to be spent.

Q. How often does this happen?

A. Most servers are kept in use by Wikimedia beyond three years. Many of the servers that we have turned off in this batch are anywhere from 3 to 5 years old. We only replace them when it makes sense from a technical standpoint to do so. This means we cannot just say ‘we will do this every X months.’ We try to get the most use out of every server, as they were donated or purchased with donations. So there is no set date. Just keep checking the Wikimedia Technical Blog: when we have more to donate, we will say so there!

Q. I am a student/person/so and so, and I want to learn to develop and do such and such.  Can you send me a server?

A. Sorry, unfortunately it is just not realistic or fair of us to try to sort out which personal use requests for servers are legitimate and which are folks wanting computers for any other reason. We choose to limit our donations to other like-minded non-profit organizations.

Rob Halsell
Systems Administrator

SVG Open

I’m hanging out down in Mountain View for the SVG Open conference this weekend, to speak a bit on how we use and plan to use SVG at Wikimedia and to get up to date on the state of the art. I’ll post my full talk slides on Sunday after my talk…

One of the most exciting new developments in the SVG world right now is svgweb, a very cool tool which brings high-quality SVG rendering support (including full support for the SVG DOM and interactivity) to any browser that supports Flash. This essentially fills the “SVG gap” for most Internet Explorer users, which opens up a huge world of possibilities for both interactive content and tools for building, editing, and localizing SVG-based diagrams, charts, maps, etc., right in the browser.

Google web standards evangelist Brad Neuberg gave a great talk about the background of how something like svgweb was needed and showed some great demos, including a quick preview of an inline SVG pan-and-zoom tool for Wikipedia / Wikimedia Commons; we’ll have some even funner demos based on that Sunday!

Also saw a good talk from Sam Ruby on some of the gotchas in the current state of HTML vs XHTML vs HTML5 and how SVG is (or isn’t) supported in various profiles and various browsers. Most interesting was his proposal to rethink how we deal with markup validators in the webdev world — right now most validators give you a lot of errors about things that don’t really make a difference (font vs style?), but freely ignore problematic but “legitimate” structures (say, unclosed list items).

SVG for all… with Flash?

For several years, we’ve supported uploading SVG vector images to Wikimedia sites… with the limitation that they would be rendered to static PNG raster images when actually used inline.

This gives our editors great flexibility in editing, customizing, and translating maps and diagrams using cross-platform free tools like Inkscape, but we’re missing out on some of the big potential in SVG — high-quality scaling for zoomed displays and printing, and animation and scripted interactivity.

In large part we can blame Internet Explorer — the most widely used browser has never supported SVG graphics natively, and Adobe isn’t even maintaining their plug-in anymore! With the majority of users cut out, we’ve had little incentive to move forward with new capabilities that would be closed to most visitors.

But that may be changing, thanks to… Flash??

svgweb implements a highly capable SVG renderer in JavaScript and Flash, bringing high-quality, scriptable SVG support to the ~95% of web users who have either Flash or a natively SVG-capable browser.

I love to see Flash’s near-ubiquity used for good — implementing support for modern, open web standards on older and less capable browsers.

One of the chief drivers of the project is Google open standards evangelist Brad Neuberg; we had a great talk today along with Trevor on our Usability team and Michael of Metavid/Kaltura/video awesomeness, and we’re all very excited at the possibilities.

We’re going to see if we can whip together some basic integration in time to show at the SVG Open conference in October, starting with a basic zoom-and-pan view for SVG images which can make use of native or emulated SVG support.

Future ideas that have us really excited include:

  • Live previewing of parameterized images at insert time (localized text, highlighted map segments, charts, etc)
  • On-web basic vector image editing? Sometimes you just need to make an adjustment and installing Inkscape is kind of heavyweight.

Pure SVG + JavaScript should be able to provide for selecting, moving, adding, and altering objects, which we could then save back to a new version of the file… svgweb’s powerful scripting support should be able to extend this to Internet Explorer users too!

Use of SVG originals inline in article pages is more dependent on file size issues. We have a lot of files that are just plain huge, especially detailed maps, and the SVG version ends up being a lot slower to download and display.

A project which can help with that is Scour, a tool to optimize SVG source by stripping out unneeded verbosity and rearranging style bits to keep size down.

With further work to strip out detail that will never be visible, a filter like this could let us produce output files that are more suitable for on-screen viewing while still scaling up nicely on zoomed displays and printed output.
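As a rough illustration of the kind of filtering described above, here is a minimal Python sketch using only the standard library. It is not how Scour works internally; it only shows the general idea of dropping data that never affects rendering. The function name slim_svg and the particular choice of what to strip (metadata blocks plus elements and attributes in the Inkscape and Sodipodi editor namespaces) are assumptions made for this example. Comments also disappear, because the standard-library parser discards them.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Namespaces used by editors for their own bookkeeping (assumed safe to drop
# for display purposes; a file someone wants to keep editing should retain them).
EDITOR_NAMESPACES = (
    "http://www.inkscape.org/namespaces/inkscape",
    "http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd",
)


def _in_editor_namespace(qualified_name):
    """True if an ElementTree '{uri}local' name belongs to an editor namespace."""
    return any(qualified_name.startswith("{" + ns + "}") for ns in EDITOR_NAMESPACES)


def slim_svg(svg_text):
    """Return SVG source with metadata and editor-only data removed."""
    ET.register_namespace("", SVG_NS)  # serialize SVG elements without a prefix
    root = ET.fromstring(svg_text)
    for parent in list(root.iter()):
        # Drop <metadata> blocks and editor-only child elements.
        for child in list(parent):
            if child.tag == "{%s}metadata" % SVG_NS or _in_editor_namespace(child.tag):
                parent.remove(child)
        # Drop editor-only attributes such as inkscape:label.
        for name in list(parent.attrib):
            if _in_editor_namespace(name):
                del parent.attrib[name]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<svg xmlns="http://www.w3.org/2000/svg" '
        'xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" '
        'inkscape:version="0.47"><!-- note --><metadata>x</metadata>'
        '<rect width="10" height="10" inkscape:label="box"/></svg>'
    )
    slimmed = slim_svg(sample)
    print(len(sample), "->", len(slimmed), "characters")
```

The savings from a pass like this are modest on hand-written files but can add up on editor-generated ones; the bigger wins for huge maps would come from reducing path detail, which this sketch does not attempt.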

Brion Vibber, Lead Software Architect

Let’s tango!

The open source movement is not only about software and knowledge base creation. There are active movements in user interface design as well. Tango! is one of the neatest projects in the collaborative design world, contributing to open source software such as OpenOffice.org and Ubuntu. We, the usability team, also benefit from such open source design projects, which allow us to reuse their icons and modify them to meet our needs. For example, the icon on the right is the new reference tool icon which can be found in the enhanced toolbar. It reuses a GNOME Desktop icon from Wikimedia Commons.

The first set of usability enhancements (new tab layout, enhanced toolbar, and reorganized search page) is now available in MediaWiki projects, except for right-to-left language wikis such as Arabic and Hebrew. Support for right-to-left languages should be available in a few weeks, so just hang in there. We welcome you to try out the usability enhancements by going into your preferences and enabling ‘Vector’ and the enhanced toolbar from the Appearance and Editing menus.

I hope you find the new interface easy to interact with. Let us know your feedback on the discussion page of the most recent release page.

Naoko Komura, Program Manager, Usability Initiative

OER Search Discovery – not just another TLA

I’ve spent today in sunny Cambridge, MA, attending the OER Search Discovery 2009 workshop at Harvard’s Berkman Center. But what’s it all about?

First off, what’s OER?

Open Educational Resources are a little tough to really define to everyone’s satisfaction, but we can defer the details. :) We’re generally talking about pedagogical materials (something that could be put to use in the classroom to teach students) available under some sort of open content license.

Secondly, what’s OER search?

Creative Commons’ ccLearn project has put together DiscoverEd, a prototype search engine which includes some relevant metadata (subject matter, language, target age range, license) as well as metadata about which collection of resource links it came from. This is rather clever, allowing teachers or students to limit their searches to what’s relevant as well as what’s trusted.

Third, what’s OER search discovery?

Traditionally, most electronic educational resource collections have been walled silos. Even if the materials themselves are open and redistributable, the collections’ searches are separate, and often there’s been confusion over the openness of the metadata as well, which has held back federated searches on a larger scale.

With major search engines like Google and Yahoo now starting to index metadata embedded in web pages (RDFa and/or microformats) and make it available for searches, this is a great time to start pushing more active and integrated semantic search data. (Note: here we’re talking about metadata about the actual materials, not about the subject of the materials. That’s a matter for another day!) Content creators, if enabled by content management tool developers, can start actually getting some concrete benefit from embedding semantic data into their web sites. This data will be picked up by the general search crawlers, but will also be available to targeted repositories collecting links and metadata about educational materials on the web.

How can we benefit?

There are two sides of this which Wikimedia can work at:

  • On the content creation side, we can provide more ways to add useful metadata to our pages, making it easier for teachers and students searching through education-themed portals to find them. MediaWiki already provides basic language and license information, but projects like Wikibooks and Wikiversity (as well as other MediaWiki users like WikiEducator) could definitely benefit from a consistent way to specify the subject and target audience of lesson modules.
  • On the consumer side, we want to be able to find and use free/open media resources from elsewhere on the web to supplement the ones we already have on Wikimedia Commons. The in-development Add Media Wizard can currently search and fetch from a few hardcoded repositories like and Flickr, but editors could benefit a lot from having either broader (whole internet search like Google Images with license limits) or narrower sources (a particular educational resource repository desired by a given site or community).

How can we help?

  • We’ll want to find a good, clean, maintainable, and easy to use way for wiki page authors to add resource metadata to their pages, which can be exposed to spiders and repository crawlers. RDFa vs microformats vs XHTML vs HTML 5 needs some resolution on the output format, but more interesting is making sure we have a clean user interface/workflow in the edit window without cluttering up the wiki markup.
  • If/when folks standardize on a search query format as well, we can make it absurdly easy to add specific repositories to the MediaWiki media picker. In the meantime, we can target some whole-net search engines that index license and subject metadata such as Yahoo’s SearchMonkey, which will provide relevant indexing of web sites which have provided for metadata autodiscovery with embedded RDFa etc.
  • We might also think about acting as a repository ourselves: Wikipedia and our sister projects are full of references to excellent resources both online and off. Can we record what we know about them and make that searchable internally and externally?
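To give a feel for what exposing resource metadata to crawlers could look like, here is a small Python sketch that renders an RDFa-annotated HTML fragment for one page. Nothing here is an existing MediaWiki feature: the function name render_resource_metadata, the choice of Dublin Core terms for subject, audience, and language, and the example.org URL are all assumptions for illustration. The output format question raised above (RDFa vs microformats) is still open, so treat this as one possible shape rather than a proposal.

```python
from html import escape

# Dublin Core terms namespace, used here as an assumed vocabulary.
DCTERMS = "http://purl.org/dc/terms/"


def render_resource_metadata(page_url, subject, audience, language, license_url):
    """Return an HTML fragment describing one page with RDFa attributes."""
    fields = [
        ("dct:subject", subject),
        ("dct:audience", audience),
        ("dct:language", language),
    ]
    spans = "".join(
        f'<span property="{name}">{escape(value)}</span>' for name, value in fields
    )
    return (
        f'<div xmlns:dct="{DCTERMS}" about="{escape(page_url)}">'
        f"{spans}"
        f'<a rel="license" href="{escape(license_url)}">License</a>'
        "</div>"
    )


if __name__ == "__main__":
    print(
        render_resource_metadata(
            "http://example.org/wiki/Lesson",
            "Photosynthesis & light",
            "secondary school",
            "en",
            "http://creativecommons.org/licenses/by-sa/3.0/",
        )
    )
```

In a wiki setting the five values would come from a form or template in the edit workflow rather than from hand-written markup, which is exactly the user interface question that still needs solving.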

Folks at the workshop are also hoping we can agitate for similar moves in other tools… I know I would benefit from a free media picker for WordPress!

Brion Vibber, Lead Software Architect

Firefox 3.5 brings native open video support

Congratulations are in order for our friends and comrades-in-arms at Mozilla: they’ve released version 3.5 of their open-source Firefox browser today.

Aside from major improvements to speed and memory usage, one of the updates that has got us most excited at Wikimedia is the support for HTML 5’s native <video> and <audio> elements.

What does this mean? Well in short, it means that Firefox 3.5 is the best browser to run video and audio clips from Wikimedia Commons on!


A few months more down the line, we’ll start being able to integrate support for our inline video sequencer, which’ll make it easy to extract snippets of a longer video and combine them — entirely using open-source, non-patent-encumbered web standards. This makes heavy use of the new HTML 5 multimedia support; while at first editing will be limited to Firefox 3.5 users, other browsers are continuing to improve and adopt the same support.