top | item 11031069

dmunoz | 10 years ago

I recently stepped into a role with a devops component, and one of my first surprises was just how slow status.aws.amazon.com was to update during ongoing issues. I had to scramble to find confirmation on Twitter and external forums for the client.

atom_enger|10 years ago

What's even worse is that when Amazon finally updates their status page, it's usually still a green icon with a little "i" tick for "information", even if it was a partial outage. It takes a lot for the icons to go red, which is what you'd look for if you're experiencing issues.

I do the same thing, often searching Twitter for "aws" or "outage" and finding people complaining about the problem, which confirms my suspicions. It's a sad state of affairs when you have to do this, and Amazon doesn't seem interested in fixing it.

paulddraper|10 years ago

The most recent issue that affected me was when all EC2 instances in VPCs couldn't connect to S3. At all.

It wasn't indicated on the status page until after it was fixed. And it was indicated as a green check in a sea of green checks. With a small "i" in the corner to represent the outage.

I love AWS. It's not without fault but overall I think it's been well architected, well documented, and well implemented.

But the status page has got to be the ultimate example of what not to do.

click170|10 years ago

If you have a support agreement with them then file a ticket requesting better customer communication and link back here as an example of how to do it right.

I think everyone complains in forums and online but doesn't actually file tickets about it. These things are worth tickets too.

eric_h|10 years ago

Day to day I mostly write software, but I also help manage our infrastructure (we're a small company - 9 people total, 4 engineers, and I'm one of the 2 who understand managing servers well enough to support it). We were on Linode up until about a year and change ago and switched to AWS/OpsWorks to both decrease our infrastructure bill and increase our ability to scale horizontally quickly (for unfortunately long definitions of quickly - "running setup...")

Both Linode and Amazon suck at their status pages (though Linode was quite informative about their DDoS outages that started on Christmas). For every Amazon issue we've had, the status page only changed once they'd more or less fixed it. As far as I'm concerned their status page is basically useless unless it's an extended outage, at which point it's still basically useless...