One data center. Still waiting for 100 machines to reboot… not best practice. A “routine scheduled switch to the backup generator this morning at 2:30am caused a fire that destroyed both the backup and the primary. Firefighters took a while to extinguish the fire. Power is now back up and 400 out of the 500 servers rebooted, still waiting for the last 100 to have the whole system fully functional.”

Originally shared by Alexandre Keledjian

At least we’re still safe; the rise of the machines isn’t happening today.

That’s not an IT problem. It’s a design and management problem.

Maybe outsource that “data center” to people who know how to run one?

@Adam_Liss Maybe, but one can still have a tenancy strategy that would be resilient against this kind of thing.
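
Just to sketch what I mean (purely illustrative Python, with made-up endpoints and nothing to do with Delta’s actual systems): a client that health-checks the primary site and fails over to a replica in a second data center, so losing one building doesn’t take the whole service down.

```python
# Rough illustration of a multi-site tenancy strategy: never depend on a
# single data center; fail over to a replica elsewhere when the primary dies.
# Endpoints below are hypothetical placeholders.
import urllib.request

ENDPOINTS = [
    "https://booking.dc-atlanta.example.com/health",      # primary site
    "https://booking.dc-minneapolis.example.com/health",  # independent backup site
]

def first_healthy_endpoint(endpoints, timeout=2):
    """Return the first endpoint that answers its health check, or None."""
    for url in endpoints:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return url
        except OSError:
            # That data center is unreachable (or on fire); try the next one.
            continue
    return None

if __name__ == "__main__":
    target = first_healthy_endpoint(ENDPOINTS)
    print(target or "Every site is down; no tenancy strategy survives that.")
```

The point isn’t the code, it’s the topology: two sites that don’t share a generator, a transfer switch, or a building.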

Not to be too cynical, but this is a story reported by a forum member (you’ve been on the Internet, right?), relayed from a Delta captain (who is probably no expert in data center power and fire-suppression technologies), who heard it from, who, his dispatcher at ATL? Who heard it from his buddy in IT who thinks his bosses are a bunch of idiots…

I’m not sure I believe much of this game of telephone beyond “yup, something happened in a datacenter somewhere, maybe Atlanta.”

I see Delta has an open position for Crisis Manager in IT:
https://delta.greatjob.net/jobs/JobDescRequestAction.action?PSUID=b8774482-09ff-46ad-8254-c9cbe0462057
(via Reddit)

@Dustin_Mitchell Even so, if Delta’s practices were much better than what’s reported widely in the tech news, I suppose they would have bothered to issue a correction…?

I don’t see a PR upside to a company correcting a technical inaccuracy like that. The downside is that it keeps the story in the news. So I’m not surprised it wasn’t loudly corrected (and who knows, Ars may have gotten a complaint from Delta and decided not to update the story).

I suppose it’s different for tech companies vs. airlines :).