Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Open Web Analytics 1.4

Open Web Analytics 1.4.0rc3 is out!  You probably don’t care, do you?  You should!  At least we do!

Anyway, let’s start at the beginning:

As we strategized about future development of Wikimedia properties, it became abundantly clear that the measurement tools we have are insufficient for the decisions we need to make.  This was a key recommendation from the Strategy task force. We looked at several possible analytics frameworks as a supplement to, or even a replacement for, our homegrown system(s).  After evaluating a couple of open source solutions (while keeping an open mind about the possible need to go with a proprietary solution), we decided to try out Open Web Analytics (OWA) for this year’s fundraiser, with the goal of assessing it for broader use.

OWA is a PHP-based analytics tool with sophisticated capabilities for real-time data analysis, including many of the tools offered by its proprietary counterparts. For us, OWA seems to hit the right balance of flexibility and scalability, with the added benefit that there was already an integration plugin for MediaWiki.  Over the past few months, we’ve been working with Peter Adams, the designer of OWA, to adapt OWA for our needs and to make sure that it would work at the scale at which we operate.

Many of the features in the 1.4 release were made initially for our use, but are general-purpose features that many OWA users should be able to benefit from.  We wanted to track how successful we were at getting people from banner, to letter, to donation, so Peter added a couple of features called “conversion goal tracking” and “goal funnels”, which will help us figure out where people might be dropping off, but can also be used for general conversion analysis on any OWA-enabled site.  We also needed to keep track of all of this on a per-banner basis, and to know whether the user clicked on the banner or on the “Donate” link in the sidebar, so the “campaign tracking” feature was added.
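To make the funnel idea concrete, here is a minimal sketch in Python of how drop-off between the banner, letter, and donation steps could be computed per banner. It is only an illustration, not OWA’s actual implementation or API: the event tuples, the step names, and the functions funnel_counts and drop_off are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical funnel steps, named after the path described above.
FUNNEL_STEPS = ["banner", "letter", "donation"]


def funnel_counts(events):
    """Count, per banner, how many visitors reached each funnel step.

    `events` is an iterable of (visitor_id, banner_id, step) tuples
    (an assumed format). A visitor counts toward a step only if they
    also reached every earlier step in FUNNEL_STEPS.
    """
    reached = {}  # (visitor_id, banner_id) -> set of steps seen
    for visitor_id, banner_id, step in events:
        reached.setdefault((visitor_id, banner_id), set()).add(step)

    counts = {}  # banner_id -> Counter of step -> visitors
    for (_, banner_id), steps in reached.items():
        per_banner = counts.setdefault(banner_id, Counter())
        for step in FUNNEL_STEPS:
            if step not in steps:
                break  # visitor left the funnel before this step
            per_banner[step] += 1
    return counts


def drop_off(per_banner):
    """Return the fraction of visitors lost between consecutive steps."""
    rates = {}
    for prev, nxt in zip(FUNNEL_STEPS, FUNNEL_STEPS[1:]):
        if per_banner[prev]:  # skip transitions nobody could make
            rates[f"{prev}->{nxt}"] = 1 - per_banner[nxt] / per_banner[prev]
    return rates


# Made-up sample data: three visitors saw banner "A", one saw banner "B".
sample_events = [
    ("v1", "A", "banner"), ("v1", "A", "letter"), ("v1", "A", "donation"),
    ("v2", "A", "banner"), ("v2", "A", "letter"),
    ("v3", "A", "banner"),
    ("v4", "B", "banner"),
]
sample_counts = funnel_counts(sample_events)
```

Calling drop_off(sample_counts["A"]) on this made-up data gives the share of visitors lost at each transition for banner “A”, which is the kind of per-banner comparison the campaign tracking described above is meant to support.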

Finally, we needed to deploy many instances of OWA, so clustered deployment was added in this release.  Peter worked with Nimish Gautam here at WMF to make OWA more scalable, with Nimish becoming a committer on OWA. Peter focused on the architecture, while Nimish focused on making sure that all of the work integrated seamlessly into Wikimedia’s environment.

We’ve just deployed OWA to observe traffic patterns for the fundraiser, and we’ll be reporting on how well it works for us.  We’re not using all of the features; for example, we’ve disabled features such as mouse movement recording/playback.  We’re being very careful to respect everyone’s privacy and stay true to the WMF donor privacy policy and the Wikimedia privacy policy.

We believe the work we’ve done is generally applicable to anyone who wants MediaWiki analytics, and we’re eager to see how it works for others.  We are also at a point where we would love help with testing this.

8 Responses to “Open Web Analytics 1.4”

  1. tomasz says:

    @Roger We hadn’t known about ‘X-Do-Not-Track’ during our evaluation but we’ll certainly keep an eye on it for future OWA releases.

  2. Roger Knott says:

    Any chance you’ll support X-Do-Not-Track: Yes in your web analytics? See donottrack.us for the spec.

  3. Tim says:

    @Nick MongoDB is indeed webscale :)

    Described in detail in the educational video:
    http://nosql.mypopescu.com/post/1016320617/mongodb-is-web-scale

  4. Nick says:

    @Nimish I know the Piwik team mentioned in the forums that they are working on MongoDB support for scaling – but I’m not sure if it’s going to be released soon or not

  5. Nimish says:

    @Nick: We considered Piwik and did see some positives to it, but based on some of our specific needs (such as scaling based on our existing infrastructure and our particular form of e-commerce integration) and the timeframe in which we needed them, we ended up going with OWA.

  6. @1: Hi Osama, we very much want to make sure we maintain some of the best privacy practices on the web. We’ve not rolled out any of this stuff beyond the fundraising funnel (which is really about tracking the path from the banner to the credit card transaction, not about reading habits). Much of our work already has been centered around making sure we’re not collecting anything we don’t need. We’re still trying to figure out if/how we’ll be rolling these tools out for wider use, and we plan to make sure we’re very clear prior to deployment about the information we want to collect and how we collect it, so that everyone can have an informed conversation about what safeguards we need to put in place, and what features we might need to disable. Watch this blog for further updates.

    @2: Hi Nick, there are a number of factors that went into our decision to try OWA rather than Piwik. The short answer is that, based on my understanding of things, OWA stores the data in a way that gives us more flexibility in analyzing the data. We’ll ask Nimish and/or Peter to weigh in with more detail.

  7. Nick says:

    I’m surprised you didn’t go with Piwik, it is such a better solution! it has funnels, goal tracking and more… and it seems more active than OWA too

  8. Osama Khalid says:

    Well, hmm. I’m not very comfortable with anyone tracking my moves on Wikipedia. I’m afraid that Eben Moglen’s words before Congress may not apply any more. He said (03:40:16):

    I very much doubt, Mr. Chairman, that there is any person in this room whose life hadn’t been altered by Wikipedia, [...] Wikipedia is unsupported by advertisements and of the top hundred websites on the Net [it was] the only one of the one hundred not in any way surveilling its users.

    I also wonder why the Foundation would advertise its intention to lock its users’ data into a proprietary system. I mean really, is this by any means a way to respect privacy?
    Anyway, I’m fine with storing anonymous data about page visits in an autonomous environment protected by the non-profit Wikimedia Foundation. I think we can compromise very little on privacy and autonomy.
