Wikipedia Adopts MariaDB

Translate This Post

This past Wednesday marked a milestone in the evolution of Wikimedia’s Database infrastructure: the completion of the migration of the English and German Wikipedias, as well as Wikidata, to MariaDB 5.5.
For the last several years, we’ve been operating the Facebook fork of MySQL 5.1 with most of our production environment running a build of r3753. We’ve been pleased with its performance; Facebook’s MySQL team contains some of the finest database engineers in the industry and they’ve done much to advance the open source MySQL ecosystem.
That said, MariaDB’s optimizer enhancements, the feature set of Percona’s XtraDB (many overlap with the Facebook patch, but I particularly like add-ons such as the ability to save the buffer pool LRU list, avoiding costly warmups on new servers), and of Oracle’s MySQL 5.5 provide compelling reasons to consider upgrading. Equally important, as supporters of the free culture movement, the Wikimedia Foundation strongly prefers free software projects; that includes a preference for projects without bifurcated code bases between differently licensed free and enterprise editions. We welcome and support the MariaDB Foundation as a not-for-profit steward of the free and open MySQL related database community.

Preparing For Change

Major version upgrades of a production database are not to be made lightly. In fact, as late as 2011, some Wikipedia languages were still running a heavily patched version of MySQL 4.0 — the migration to 5.1 required both schema changes, and direct modifications of data dumps to alter the padding of binary-typed columns. MySQL 5.5 contains a variety of incompatibilities with prior versions, thanks in part to better compliance with SQL standards. Changes to the query optimizer between versions may also change the execution plan for common queries, sometimes for the better but historically, sometimes not. SQL behavior changes may result in replication breakage or data consistency issues, while performance regressions, whether from query plan or other changes, can cause site outages. This calls for a lot of testing.
Compatibility testing was accomplished by running MariaDB replicas outside of production, watching for replication errors, replaying production read queries and validating results. After identifying and fixing a couple of MediaWiki issues that surfaced as replication errors (along the lines of trying to set unsigned integer types to negative values which previously caused a wrap-around instead of an error) we replayed production read queries using pt-upgrade from Percona Toolkit. Pt-upgrade replays a query log against two servers, and compares the responses for variances or errors. Scripts originally developed for our recent datacenter migration to simultaneously warmup many standby databases from current production read traffic helped with rough load testing and benchmarking. Along the way, a pair of bugs in MariaDB 5.5.28 and 5.5.29 were identified, one of which was a rare but potentially severe performance regression related to a new query optimizer feature. The MariaDB team was very responsive and quick to offer solutions, complete with test cases.

Performance Testing In Production

As a read-heavy site, Wikipedia aggressively uses edge caching. Approximately 90% of pageviews are served entirely from the edge while at the application layer, we utilize both memcached and redis in addition to MySQL. Despite that, the MySQL databases serving English Wikipedia alone reach a daily peak of ~50k queries/second. Most are read queries served by load-balanced slaves, depending on consistency requirements. 80% of the English Wikipedia query load (up to 40k qps) are typically handled by just two database servers at any given time. Our most common query type (40% of all) has a median execution time of ~0.2ms and a 95th percentile time of ~50ms. To successfully use MariaDB in production, we need it to keep up with the level of performance obtained from Facebook’s MySQL fork, and to behave consistently as traffic patterns change.

Ishmael views of pt-query-digest data collected via tcpdump for the most common Wikipedia read queries (pdf). The first page of a query shows data from db1042, running mysql-facebook-r3753, the second from db1043 over the same time period, running MariaDB 5.5.30.
Ishmael views of pt-query-digest data collected via tcpdump for the most common Wikipedia read queries (pdf). The first page of a query shows data from db1042, running 5.1fb-r3753, the second from db1043 over the same time period, running MariaDB 5.5.30.

Once confident that application compatibility issues were solved and comfortable with performance obtained under benchmark conditions, it was time to test in production. One of the production read slaves from the English Wikipedia shard was taken out of rotation, upgraded to MariaDB 5.5.30, and then returned for warmup. The load balancer weight was then gradually increased until it and a server still running MySQL 5.1-facebook-r3753 were equally weighted and receiving most of the query load.
Also from the Percona Toolkit, we use pt-query-digest across all database servers to collect query performance data which is then stored in a centralized database. Query data is collected from two sources per server and stored in separate buckets — from the slow query which only captures queries exceeding 450ms, and from periodic brief sampling of all queries obtained by tcpdump. Ishmael provides a convenient way to visualize and inspect query digest data over time. Using it, along with direct analysis of the raw data, allowed us to validate that every query continued to perform within acceptable bounds.
For our most common query type, 95th percentile times over an 8-hour period dropped from 56ms to 43ms and the average from 15.4ms to 12.7ms. 50th percentile times remained a bit better with the 5.1-facebook build over the sample period, 0.185ms vs. 0.194ms. Many query types were 4-15% faster with MariaDB 5.5.30 under production load, a few were 5% slower, and nothing appeared aberrant beyond those bounds.
From there, we upgraded the remaining slaves one by one, before finally rotating in a newer upgraded class of servers to act as masters. The switch was seamless and performance continues to look good. We’ll be completing the migration of shards covering the rest of our projects over the next month. Beyond that, we’re looking forward to the future release of MariaDB 10 (global transaction IDs!), and are continually assessing ways to improve our data storage infrastructure. If you’re interested in helping, the Wikimedia Foundation is hiring!
Asher Feldman, Site Architect

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.

Can you help us translate this article?

In order for this article to reach as many people as possible we would like your help. Can you translate this article to get the message out?

62 Comments
Inline Feedbacks
View all comments

That is really a wonderfull job, no doubt!
But, I’m a bit curious, as this migration seems not to be an easy job, did you benchmarked for some NoSQL or graph database on the go ? I’m not sure about all the datas to say that it would have been a interesting solution but reading you I was wondering If you did and what you think about it?
Thanks, and keep up the good work! 🙂

[…] is a nice blog post from Asher Feldman, Site Architect, Wikipedia on how Wikipedia Adopts MariaDB. If you’re using English or German Wikipedia, or using Wikidata, you’re currently being […]

And what about MySQL 5.6?

MySQL 5.6 really is exciting and brings with it new features that I’ve long wanted. Some of those are addressed by MariaDB 5.5 and many of the rest are coming in MariaDB 10. I think the MariaDB team is putting good thought into the design of those features vs. simply porting (see for example http://kristiannielsen.livejournal.com/17008.html). That, plus the few small steps Oracle has made away from community transparency (http://mysqlha.blogspot.com/2013/04/good-news-mysql-5611-is-here.html) make me feel comfortable with selecting MariaDB as Wikimedia’s preferred branch. And since MySQL 5.6 has had some initial performance (vs. 5.5) and stability issues, waiting to adopt doesn’t seem to… Read more »

Very good news ! It’s been a long way for MariaDB.
@LM : MySQL 5.6 is not as powerful as MariaDB 5.5 (http://blog.mariadb.org/sysbench-oltp-mysql-5-6-vs-mariadb-10-0/). And mostly, the difference will very probably be even bigger in the future.
By the way the free and open politic of MariaDB is more what Wikipédia look for. Sadly MySQL has been sabotaged by Oracle, they are not really “Open Source addict”.

PiT: Such a change would likely necessitate a complete re-write of the wiki code-base, and result in a complicated upgrade/migration process, almost certainly involving a significant chunk of downtime to manage.

PiT: MediaWiki and its extension ecosystem (Wikipedia runs over 90, many of which are community developed) very much use a relational model. As Twirrim noted above, it would be a far more complicated engineering effort to move away from an RDBMS, while MariaDB meets most of our needs. That said, I definitely see a graph database in our future for Wikidata. We also use InnoDB as a sharded key-value store to store the gzip’d full text of every revision of every article across all of our projects. I’ve considered NoSQL stores for that part of our infrastructure and wouldn’t rule… Read more »

Good job. A nice success story. I am sure there are plenty more details of the small edge cases in query performance differs, configuration tuning etc that would be good to read about in a more technical review.

[…] Blog Wikimedia […]

[…] Wikimedia объявила о завершении процесса миграции первичной серверной […]

[…] Wikipedia adopts MariaDB. Again, this is important and its also important that we have the MariaDB Foundation, considering the first link in this post. […]

[…] Wikimedia anunció la finalización de la primera fase de la migración de la infraestructura de servidores de bases […]

[…] Wikipedia se pasa a MariaDB […]

Why not PostgreSQL? Or What does MariaDB have over PostgreSQL that made you decide Maria DB? Was it wanting to remain in a MySQL universe?

@Mr Phi:
Hmm, one can object to this benchmark, as MySQL 5.6 is much more powerful than MariaDB 5.5 (http://dimitrik.free.fr/blog/archives/2013/02/mysql-performance-mysql-56-vs-mysql-55-vs-mariadb-55.html)
Just my 2c

[…] bodies though that are supporting MariaDB. WikiMedia announced on April 22 that they has completed the migration of the English and German Wikipedias to MariaDB. Why? Because, MariaDB is […]

[…] en vez de MySQL. Y hoy una de las webs con más tráfico del mundo, la Wikipedia, anuncia que migran su sistema de base de datos en las versiones inglesa y alemana desde un fork hecho por Facebook para MySQL 5.1 a MariaDB 5.5. […]

[…] for another fork of MySQL maintained by Facebook. Wikimedia Foundation Site Architect Asher Feldman announced the adoption of MariaDB in a post yesterday, saying that the move was in part because of “a preference for (open source) […]

[…] bodies though that are supporting MariaDB. WikiMedia announced on April 22 that they has completed the migration of the English and German Wikipedias to MariaDB. Why? Because, MariaDB is […]

[…] bodies though that are supporting MariaDB. WikiMedia announced on April 22 that they has completed the migration of the English and German Wikipedias to MariaDB. Why? Because, MariaDB is […]

[…] bodies though that are supporting MariaDB. WikiMedia announced on April 22 that they has completed the migration of the English and German Wikipedias to MariaDB. Why? Because, MariaDB is […]

[…] bodies though that are supporting MariaDB. WikiMedia announced on April 22 that they has completed the migration of the English and German Wikipedias to MariaDB. Why? Because, MariaDB is […]

[…] Wikipedia Adopts MariaDB — Wikimedia blog This past Wednesday marked a milestone in the evolution of Wikimedia’s Database infrastructure: the completion of the migration of the English and German Wikipedias, as well as Wikidata, to MariaDB 5…. […]

@PiT – MariaDB is a drop-in replacement for MySQL, so I assume that the transition was relatively straightforward. Switching to a NoSQL or graph database would have required a major rewrite.

[…] Server sowie Wikidata von MySQL 5.1 auf MariaDB 5.5 umgestellt. Das geht aus einem ausführlichen Blogbeitrag des für die Migration verantwortlichen Asher Feldman […]

I also wonder, why you didn’t switch to PostgreSQL. Did you consider/benchmarked it? And if, what were the results?
kind regards

[…] Zeit und wechselt von MySQL zu MariaDB. Wie der für die Migration verantwortliche Asher Feldman im Wikimedia-Blog berichtet, wurden die deutschen und britischen Datenbank-Server der Wikipedia bereits vor Tagen von […]

[…] Más información – Wikipedia se cambia de MySQL a MariaDB, openSUSE abandona MySQL y se pasa a MariaDB, Arch Linux también se pasa a MariaDB Fuente – Anuncio oficial […]

Wikipedia’s data model doesn’t need transactions or more advanced features, so PostgreSQL isn’t needed. Like MySQL, MariaDB isn’t a RDBMS, only a toy database for simple tasks where, for example, referential integritiy and transactions aren’t needed.

@Nathan Broadbent: Not just rewriting the code, but some changes in the developers crew is also needed…

[…] wie openSUSE und Fedora. Die deutschen und britischen Sites von Wikipedia laufen seit wenigen Tagen auf dieser […]

[…] également lu un article (que je vous conseil) qui explique que MariaDB arrive même a dépasser les performances du MySQL […]

[…] 甲骨文与开源社区关系的恶化正促使许多发行版用MariaDB取代MySQL,现在使用MySQL的流行网站维基百科也宣布从MySQL迁移到MariaDB,部分原因是甲骨文的MySQL有免费版和企业版,而企业版的新功能没有公开源码,而MariaDB则没有这方面的问题。 […]

[…] A  blog post by project lead Asher Feldman has announced that the Wikimedia Foundation migrated the English and German language editions of Wikipedia, as well as Wikidata, from MySQL 5.1 to MariaDB 5.5 a few days ago. […]

[…] 甲骨文与开源社区关系的恶化正促使许多发行版用MariaDB取代MySQL,现在使用MySQL的流行网站维基百科也宣布从MySQL迁移到MariaDB,部分原因是甲骨文的MySQL有免费版和企业版,而企业版的新功能没有公开源码,而MariaDB则没有这方面的问题。 […]

[…] 甲骨文与开源社区关系的恶化正促使许多发行版用MariaDB取代MySQL,现在使用MySQL的流行网站维基百科也宣布从MySQL迁移到MariaDB,部分原因是甲骨文的MySQL有免费版和企业版,而企业版的新功能没有公开源码,而MariaDB则没有这方面的问题。 […]

[…] Is a drop in replacement of MySQL and large companies are starting to adopt MariaDB such as Wikipedia (Read the blog post here). […]

[…] commercial version but not the free version was never going to win favor in that scene. Wikipedia migrated to MariaDB in the last few days, and the Fedora and OpenSUSE Linux distros will both make the jump in their […]

[…] el blog oficial de la Wikimedia Foundation se explica que durante años han estado usando un fork de MySQL 5.1 realizado por Facebook en sus servidores […]

[…] Um post de Asher Feldman, líder de projeto, anunciou que a Wikimedia Foundation migrou as edições em alemão e em inglês da Wikipedia, assim como o Wikidata, do MySQL 5.1 para o MariaDB 5.5 há alguns dias. […]

[…] الإشارة إلى أن ويكيبيديا أعلنت مؤخرا عن تخليها عن نسخة مطورة من طرف فيس بوك من نظام MySql […]

[…] week Wikimedia foundation fully migrated their English and German Wikipedias over to […]

[…] líder de projeto Asher Feldman, anunciou em um post no blog da Wikimedia, que a esta terminou a alguns dias atrás a migração suas edições da Wikipedia em […]

[…] billet de blog par le chef de projet Asher Feldman a annoncé que la Fondation Wikimedia a migré les éditions en […]

[…] الشركة شرحت في تدوينة طويلة العديد من الأمور التي خططت لها قبل عملية الإنتقال وبإمكانكم قرائتها عبر هذا الرابط. […]

[…] الشركة شرحت في تدوينة طويلة العديد من الأمور التي خططت لها قبل عملية الإنتقال وبإمكانكم قرائتها عبر هذا الرابط. […]

[…] Wikipedia Adopta MariaDB […]

[…] الشركة شرحت في تدوينة طويلة العديد من الأمور التي خططت لها قبل عملية الإنتقال وبإمكانكم قرائتها عبر هذا الرابط. […]

[…] Wikipedia Adopta MariaDB […]

[…] billet de blog par le chef de projet Asher Feldman a annoncé que la Fondation Wikimedia a migré les éditions en […]