Search restored after leap second bug

At midnight UTC on July 1, Wikimedia’s search cluster stopped working. A “leap second” inserted by the NTP daemon at that time caused Java processes to lock up, including our Lucene search system. The same bug affected many other websites. Our engineers restored service in less than two hours.

Leap seconds are added to our clocks once every few years so that the sun will be directly over the Royal Observatory in Greenwich at precisely 12:00. Some people believe that the desire to keep these two time standards synchronised is anachronistic, and that it would be better to let them drift apart for 600 years and then add a single “leap hour”. I’m sure many computer engineers would breathe a sigh of relief if such a change were implemented.

Tim Starling, Lead Platform Architect

Categories: Operations, Outage, Technology

2 Comments on Search restored after leap second bug

duplicatebug 4 years

Google has a way to avoid this bug, named “leap smear”:

But this may not help with Java.

Guy Manningham 4 years

Apparently a lot of websites experienced this problem. Glad to see you guys got it fixed. I would have been completely lost if it happened to my website.
