Ever started searching for something on Wikipedia and wondered—really, is that all that there is? Does it feel like you’re somehow playing hide and seek with all the knowledge that’s out there?
Wouldn’t it be great to see articles or categories that are similar to your search query and maybe some related images or links to other languages in which to read that article? Or, perhaps you just want to read and contribute to projects other than Wikipedia but need a jump start with a few short summaries from sister projects.
Even if you simply enjoy seeing interesting snippets and images based on your search query, you’ll really like what we have in store. We’re starting to test some really cool features that will enable some fun and fascinating clicking down the rabbit hole of Wikipedia. But first, let’s look at what we’ve been doing over the last couple of years.
Back end search
The Discovery Search team has been doing tons of work creating, updating, and finessing the search back end to enhance search queries. The changes include adding ASCII folding and stemming, detecting when a visitor might be typing in a language different from that of the Wikipedia they are on, switching from tf-idf to BM25 for ranking, dropping trailing question marks, and updating to Elasticsearch version 5. Whew!
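To make a few of those back end changes concrete, here is a minimal Python sketch. It is an illustration only, not the code that runs in production search: `fold_ascii` approximates ASCII folding with Unicode decomposition (a real analyzer handles many more characters), `tf_idf` is a textbook weight rather than the search engine's exact formula, and the function names and sample numbers are all assumptions made for this example. The `k1` and `b` defaults are the values commonly used with BM25.

```python
import math
import unicodedata


def fold_ascii(text: str) -> str:
    """Approximate ASCII folding: decompose characters, then drop accents."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_query(query: str) -> str:
    """Fold accents and drop trailing question marks (hypothetical helper)."""
    return fold_ascii(query).strip().rstrip("?").strip()


def tf_idf(tf: float, doc_count: int, docs_with_term: int) -> float:
    """Textbook tf-idf: the score grows linearly with term frequency."""
    return tf * math.log(doc_count / docs_with_term)


def bm25(tf: float, doc_len: float, avg_doc_len: float,
         doc_count: int, docs_with_term: int,
         k1: float = 1.2, b: float = 0.75) -> float:
    """BM25 score for one term: term frequency saturates, length is normalized."""
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)


if __name__ == "__main__":
    print(normalize_query("Café?"))
    # Compare how each score reacts when a term appears 1, 10, or 100 times.
    for tf in (1, 10, 100):
        print(tf, tf_idf(tf, 1000, 50), bm25(tf, 300, 300, 1000, 50))
```

The point of the comparison is that repeating a word 100 times makes the tf-idf score 100 times larger, while the BM25 score levels off, so a page cannot climb the rankings just by repeating a term.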
We have much more planned in the coming months—machine learning with ‘learning to rank’, investigating and deploying new language analyzers, and, after doing an exhaustive analysis, removing quotes within queries. We’ll also be interacting closely with the new Structured Data team in their upcoming work on Commons to make freely licensed images accessible and reusable across the web.
Front end search
After all that back end search awesomeness, we needed to spruce up the part that the majority of our readers and editors actually interface with: the search results page! We started brainstorming during the late summer of 2016 on what we could do to make search results better—to easily find interesting, relevant content and to create a more intuitive viewing experience. We designed and refined numerous ideas on how to improve the search results page and received lots of good feedback from the community.
Empowered by that feedback, we began testing, starting with a display of results from the Wikimedia sister projects next to the regular search results. The idea for this test was to enable discovery into other projects (projects that our visitors might not have known about) by displaying interesting results in small snippets. The sidebar display of the sister projects borrows from a similar feature that is already in use on the Italian, Catalan and French Wikipedias. We’ve run a couple of tests on the sister project search results and completed a detailed analysis of each. After a few final touches to the code, we will release the new functionality into production on all Wikipedias near the end of April 2017.
The sister projects are an integral part of the Wikimedia family and the associated links denoting each project are often found near the footer of the front page of each Wikipedia. The Wikimedia sister projects are:
Our next test will add related results and extra information for each search query. This will take the form of an ‘explore similar’ link that, when someone interacts with it, expands to show related pages, categories and links to the article in other languages, all of which might lead to further knowledge discovery. We know that not every search query will return exactly what folks were looking for, but we feel that adding links to similar or related information would be helpful and, possibly, super interesting!
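For the curious, the kind of data such a display draws on (related pages, categories and language links) can already be requested from the public MediaWiki Action API. The sketch below only builds a request URL and is not necessarily how the ‘explore similar’ feature itself is implemented. The function name `explore_similar_url` and the default limit are assumptions made for this example, while `morelike:` is the search keyword for finding similar pages.

```python
from urllib.parse import urlencode

# Public API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def explore_similar_url(title: str, limit: int = 3) -> str:
    """Build an API URL asking for similar pages, categories and language links."""
    params = {
        "action": "query",
        "format": "json",
        "titles": title,
        "prop": "categories|langlinks",   # categories and other-language links
        "cllimit": limit,
        "lllimit": limit,
        "list": "search",
        "srsearch": f"morelike:{title}",  # pages similar to the given title
        "srlimit": limit,
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(explore_similar_url("Zebra"))
```

Opening the printed URL in a browser returns JSON with all three kinds of results for the chosen page.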
We also plan on doing a few more tests in the coming year:
- Test a new display that will show the pronunciation of a word with its definition and part of speech—all from existing data in Wiktionary. Initially, this will be in English only.
- Test placing a small image (from the article) next to each search result that is displayed on the page.
- Test an additional feature that will use a new metadata display in the search box that is located on the top right of most pages in Wikipedia, similar to what happens on the Wikipedia.org portal page when a user starts typing into the search box.
For the more technically minded, there is a way to test out these new features in your own browser. The sister project search results require a bit of URL manipulation. For the explore similar and Wiktionary widgets, you’ll need a Wikipedia account and the ability to create (or edit) your common.js file. Detailed information is available on MediaWiki.
Once the testing, analysis and feedback cycle is done for each new feature, we’d like to gradually roll them out to production on all Wikipedias throughout the rest of the year. We’re really hoping that these enhancements will deepen the usefulness of search results and enable our readers and editors to be even more productive and inspired!
Deborah Tankersley, Product Manager, Discovery Product and Analysis