top | item 45650183

julianozen | 4 months ago

There was a very large outage back in ~2017 that was caused by DynamoDB going down. Because EC2 stored its list of servers in DynamoDB, EC2 went down too. Because DynamoDB ran its compute on EC2, it was suddenly no longer able to spin up new instances to recover.

It took several days to manually spin up DynamoDB/EC2 instances so that both services could slowly recover together. Since then, there has been a big push to remove dependencies between the “tier one” systems (S3, DynamoDB, EC2, etc.) so that one system can't bring down another. Of course, it’s never foolproof.
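To make the circular dependency concrete, here is a minimal sketch of how such a loop could be spotted in a service dependency map. The map is purely illustrative and built only from the description above (EC2 depends on DynamoDB, DynamoDB depends on EC2). It is not AWS's actual architecture, and `find_cycle` is a hypothetical helper, not any real tool.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of service names, or None.

    Uses a depth-first search with three states per node:
    unvisited, in progress (on the current path), and done.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {name: UNVISITED for name in deps}
    path: List[str] = []

    def visit(name: str) -> Optional[List[str]]:
        state[name] = IN_PROGRESS
        path.append(name)
        for dep in deps.get(name, []):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == IN_PROGRESS:
                # dep is already on the current path, so we closed a loop.
                return path[path.index(dep):] + [dep]
            if dep_state == UNVISITED:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[name] = DONE
        return None

    for name in list(deps):
        if state[name] == UNVISITED:
            found = visit(name)
            if found:
                return found
    return None


# Illustrative map (an assumption): "X": ["Y"] means X needs Y to run.
illustrative_deps = {
    "EC2": ["DynamoDB"],   # EC2 stored its server list in DynamoDB
    "DynamoDB": ["EC2"],   # DynamoDB ran its compute on EC2
    "S3": [],
}

print(find_cycle(illustrative_deps))
```

A check like this only catches dependencies that someone has written down, which is part of why it's never foolproof.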

planckscnst|4 months ago

I don't remember an event like that, but I'm rather certain the scenario you described couldn't have happened in 2017.

The very large 2017 AWS outage originated in S3. Maybe you're thinking about a different event?

https://share.google/HBaV4ZMpxPEpnDvU9

julianozen|4 months ago

Sorry, the 2015 one. I misremembered the year.

https://aws.amazon.com/message/5467D2/

I imagine this was impossible in 2017 because of actions taken after the 2015 incident.