Postmortems

Two 2012 post-mortems from Cloudflare. Both look like procedural problems: pushing bad routing tables in one case and deleting a master database in the other.

The linked one (BGP trouble): http://blog.cloudflare.com/todays-outage-post-mortem
The other one (DNS trouble): http://blog.cloudflare.com/post-mortem-the-ugly-the-bad-the-good

“The result was traffic from around the world was directed to the Hong Kong data center, which was offline.” Given how complex BGP is, I’m surprised that failures like this don’t happen more often.