
Amazon RDS Multi-AZ didn't work today, what can I do?

2 points | gmedel | 13 years ago

Some RDS instances in us-east-1 went down today. My first reaction was: who cares? I have Multi-AZ, so I'm covered. Well, I was not. Multi-AZ didn't work: the instance kept booting in the same availability zone, and it was impossible to get access to it. According to Amazon's console, this was due to "Timed out waiting for a state safe to initiate user requested failover".

That made me think. From what I've read online, "Multi-AZ makes a hot copy of the production instance. If this instance fails, in three minutes you'll be on a 'backup instance' exactly like the production instance". But what if I'm writing something to the DB and the disk fails? The database will get corrupted, and if they're really making a hot copy of the disk, how can Amazon assure the customer that the data on the backup instance will not get corrupted too?

Amazon's support team couldn't give me an answer, but they told me they will get in touch when they have more information from the RDS engineering team. I also asked why Multi-AZ didn't work, and they didn't have an answer for that either.

Does anyone know how Multi-AZ really works, and how Amazon ensures data consistency? What would be the best infrastructure for high-availability databases in an Amazon environment?

Thank you so much!
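
For anyone who wants to check the same thing on their own instance, here is a minimal sketch of inspecting the Multi-AZ placement and requesting a failover from code. It assumes the boto3 SDK, whose RDS client provides `describe_db_instances` and `reboot_db_instance` (with `ForceFailover`). The helper names and the instance identifier `my-db-instance` are hypothetical, and the status check is my own precaution, not an explanation of the console error above.

```python
def describe_failover_state(rds_client, instance_id):
    """Summarize where one RDS instance and its standby currently live."""
    response = rds_client.describe_db_instances(DBInstanceIdentifier=instance_id)
    instance = response["DBInstances"][0]
    return {
        "status": instance.get("DBInstanceStatus"),
        "multi_az": instance.get("MultiAZ", False),
        "primary_az": instance.get("AvailabilityZone"),
        # Only present for Multi-AZ deployments.
        "standby_az": instance.get("SecondaryAvailabilityZone"),
    }


def request_failover(rds_client, instance_id):
    """Reboot the instance and ask RDS to fail over to the standby."""
    state = describe_failover_state(rds_client, instance_id)
    if not state["multi_az"]:
        raise ValueError(f"{instance_id} is not a Multi-AZ deployment")
    if state["status"] != "available":
        # Assumption: only attempt the reboot from the 'available' state.
        raise RuntimeError(f"{instance_id} is in state {state['status']!r}")
    rds_client.reboot_db_instance(
        DBInstanceIdentifier=instance_id,
        ForceFailover=True,
    )
    return state


# Example usage (needs AWS credentials; identifier is hypothetical):
#   import boto3
#   rds = boto3.client("rds", region_name="us-east-1")
#   print(describe_failover_state(rds, "my-db-instance"))
#   request_failover(rds, "my-db-instance")
```

Comparing `primary_az` before and after the reboot shows whether the instance actually moved to the other availability zone.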
