Surviving Amazon’s Cloudpocalypse

Two weeks ago, our Crowd Fusion team was right in the middle of the big cloud outage at Amazon. All of the big brands using our platform run on Amazon servers.

Of the dozens of stories I read, George Reese of O’Reilly had the best early recap and perspective:

If you think this week exposed weakness in the cloud, you don’t get it: it was the cloud’s shining moment, exposing the strength of cloud computing.

In short, if your systems failed in the Amazon cloud this week, it wasn’t Amazon’s fault. You either deemed an outage of this nature an acceptable risk or you failed to design for Amazon’s cloud computing model. The strength of cloud computing is that it puts control over application availability in the hands of the application developer and not in the hands of your IT staff, data center limitations, or a managed services provider.

Here are some excerpts from our story in the trenches:

After informing the Amazon representative that we had failed over to the west coast and that we no longer needed this running instance, he urged us to decommission all the US East instances that we were not using in order to free capacity in that region.

He was impressed that we had successfully failed over to the US West region when so many others were still down and said: “You were one of the very few to have a west coast contingency plan and recover quickly. Bravo.”

Read our very detailed Crowd Fusion cloudpocalypse story here.

Published by Brian Alvey

I build software that makes creative people more powerful.
