A postmortem for another App Engine outage (2011-03-08), this one due to a Python runtime software update: “While the new Python runtime
contained no known issues, a performance optimization in a system
update pushed on March 3rd included a bug which would cause future
updates to App Engine runtimes to disrupt running applications as the
new runtime rolled out.”
There’s another App Engine postmortem at
https://groups.google.com/forum/#!msg/google-appengine/p2QKJ0OSLc8/7MtZ3YC9TqQJ
for a two-hour outage on 2010-02-24:
“We failed to plan for the case of a power outage that might affect
some, but not all, of our machines in a datacenter (in this case,
about 25%). In particular, this led to incorrect analysis of the
serving state of the failed datacenter and when it might recover”
Previously in this community, the 2009-07-02 outage, caused by a GFS bug:
https://plus.google.com/u/0/110357001884194145645/posts/BWVkgofMMKt
https://groups.google.com/forum/?fromgroups=#!topic/google-appengine/jJ0aRAvRJeY