Open Web Analytics 1.4


Open Web Analytics 1.4.0rc3 is out!  You probably don’t care, do you?  You should!  At least we do!
Anyway, let’s start at the beginning:
As we strategized about future development of Wikimedia properties, it became abundantly clear that the measurement tools that we have are insufficient to make the decisions we need to make.  This was a key recommendation from the Strategy task force. We evaluated several possible analytics frameworks as a supplement or even replacement for our homegrown system(s).  After evaluating a couple of open source solutions (while keeping an open mind about the possible need to go with a proprietary solution), we decided to try out Open Web Analytics (OWA) for this year’s fundraiser, with the goal of evaluating it for broader use.
OWA is a PHP-based analytics tool with sophisticated capabilities for real-time data analysis, including many of the tools offered by its proprietary counterparts. For us, OWA seems to hit the right balance of flexibility and scalability, with the added benefit that there was already an integration plugin for MediaWiki.  Over the past few months, we’ve been working with Peter Adams, the designer of OWA, to adapt OWA for our needs and to make sure that it would work at the scale at which we operate.
Many of the features in the 1.4 release were made initially for our use, but are general-purpose features that many OWA users should be able to benefit from.  We wanted to track how successful we were at getting people from banners, to letter, to donation, so Peter added a couple of features called “conversion goal tracking” and “goal funnels” which will help us figure out where people might be dropping off, but can also be used for general conversion analysis on any OWA-enabled site.  We also needed to keep track of all of this on a per-banner basis, as well as knowing whether the user clicked on the banner or on the “Donate” link in the sidebar, so the “campaign tracking” feature was added.
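To make the idea of a goal funnel concrete, here is a minimal sketch in Python of how drop-off between funnel steps could be computed. It is not OWA's implementation or API: the step names, the session data format, and both function names are assumptions made up for illustration, loosely following the banner, letter, donation path described above.

```python
# Hypothetical funnel steps, named after the path described in the post.
# These names are illustrative assumptions, not OWA identifiers.
FUNNEL_STEPS = ["banner", "letter", "donation"]


def funnel_counts(sessions, steps=FUNNEL_STEPS):
    """Count how many sessions reached each funnel step, in order.

    `sessions` is a list of event-name lists, one list per visitor session.
    A session only counts for a step if it reached every earlier step first.
    """
    counts = [0] * len(steps)
    for events in sessions:
        position = 0  # index of the next step this session has to reach
        for event in events:
            if position < len(steps) and event == steps[position]:
                counts[position] += 1
                position += 1
    return counts


def drop_off_rates(counts):
    """Return the fraction of sessions lost between each pair of adjacent steps."""
    rates = []
    for previous, current in zip(counts, counts[1:]):
        rates.append(0.0 if previous == 0 else 1 - current / previous)
    return rates


if __name__ == "__main__":
    # Made-up sample sessions, purely for demonstration.
    sample_sessions = [
        ["banner", "letter", "donation"],
        ["banner", "letter"],
        ["banner"],
        ["sidebar"],  # never entered the banner funnel
    ]
    counts = funnel_counts(sample_sessions)
    for step, count in zip(FUNNEL_STEPS, counts):
        print(step, count)
    print(drop_off_rates(counts))
```

A per-banner breakdown, as in campaign tracking, could be approximated by grouping sessions by a banner identifier before calling `funnel_counts` on each group.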
Finally, we needed to deploy many instances of OWA, so clustered deployment was added in this release.  Peter worked with Nimish Gautam here at WMF to make OWA more scalable, with Nimish becoming a committer on OWA. Peter focused on the architecture, while Nimish focused on making sure that all of the work integrated seamlessly into Wikimedia’s environment.
We’ve just deployed OWA to observe traffic patterns for the fundraiser, and we’ll be reporting on how well it works for us.  We’re not using all of the features; for example, we’ve disabled features such as mouse movement recording/playback.  We’re being very careful to respect everyone’s privacy and stay true to the WMF donor privacy policy and the Wikimedia privacy policy.
We believe the work we’ve done is generally applicable to anyone who wants MediaWiki analytics, and we’re eager to see how it works for others.  We are also at a point where we would love help with testing this.

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.


8 Comments

Well, hmm. I’m not very comfortable with anyone tracking my moves on Wikipedia. I’m afraid that Eben Moglen’s words before Congress may not apply any more. He said (03:40:16): I very much doubt, Mr. Chairman, that there is any person in this room whose life hadn’t been altered by Wikipedia, […] Wikipedia is unsupported by advertisements and of the top hundred websites on the Net [it was] the only one of the one hundred not in any way surveilling its users. I also wonder why the Foundation would advertise its intention to lock its users’ data into a proprietary system.…

I’m surprised you didn’t go with Piwik, it is such a better solution! it has funnels, goal tracking and more… and it seems more active than OWA too

@1: Hi Osama, we very much want to make sure we maintain some of the best privacy practices on the web. We’ve not rolled out any of this stuff beyond the fundraising funnel (which is really about tracking the path from the banner to the credit card transaction, not about reading habits). Much of our work already has been centered around making sure we’re not collecting anything we don’t need. We’re still trying to figure out if/how we’ll be rolling these tools out for wider use, and we plan to make sure we’re very clear prior to deployment about the…

@Nick: We considered Piwik and did see some positives to it, but based on some of our specific needs (such as scaling based on our existing infrastructure and our particular form of e-commerce integration) and the timeframe in which we needed them, we ended up going with OWA.

@Nimish I know the Piwik team mentioned in the forums that they are working on MongoDB support for scaling, but I’m not sure if it’s going to be released soon or not

@Nick MongoDB is indeed webscale 🙂
Described in detail in the educational video:
http://nosql.mypopescu.com/post/1016320617/mongodb-is-web-scale

Any chance you’ll support X-Do-Not-Track: Yes in your web analytics? See donottrack.us for the spec.

@Roger We hadn’t known about ‘X-Do-Not-Track’ during our evaluation but we’ll certainly keep an eye on it for future OWA releases.