Postmortems

See the PDF of Google’s incident report on this week’s outage and disruption affecting Gmail, Google Drive, Apps, and account management:
“On April 16, a misconfiguration of this user authentication system caused a fraction of the login requests to be unintentionally concentrated on a relatively small number of servers.”

Seems like a thorough five-level response: fix the problem, improve alerting, add monitoring, add documentation, and refine the system’s own response to damage.

“Google Apps was inaccessible to some customers for a couple of hours Wednesday morning but has since recovered.
The outage affected the Google Apps Admin control panel, an online tool used by Google’s business customers to manage corporate Apps accounts, and other Apps services. It lasted from 5:20 a.m. Pacific Time to 7:59 a.m. Pacific Time, according to Google’s App Status Dashboard.”

via http://www.google.com/appsstatus#hl=en&v=issue&ts=1366257599000&iid=f1583d7e731ba748b2c0ff847868a813
http://static.googleusercontent.com/external_content/untrusted_dlcp/www.google.com/en//appsstatus/ir/hitkhk1bn78lyci.pdf