The longest six hours in Facebook’s history took place on October 4, 2021, when Facebook and its sister properties went dark in a catastrophic outage. The only silver lining, if there is one, is that malicious actors were not to blame. Rather, it was a self-inflicted wound caused by Facebook’s own network engineering team.
Facebook’s first engineering blog post, published on October 4, pointed the finger at a configuration error: “configuration changes on the backbone routers that coordinated network traffic between our data centers caused issues that interrupted this communication. This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt.”
Facebook followed up on October 5 with more details: “A command was issued with the intention to assess the availability of global backbone capacity, which unintentionally took down all connections in our backbone network, disconnecting Facebook data centers globally.” The post explained that Facebook’s systems are designed to audit commands like this one to prevent such mistakes, but “a bug in that audit tool prevented it from properly stopping the command.”
Yes, yet another instance where the machines turned out to be the insider that caused the havoc.