Google cloud outage

74 points | thomassharoon | 6 years ago | status.cloud.google.com

33 comments

[+] qmarchi|6 years ago|reply
Heyo Googler here.

The problem was a mix between another cloud provider and GCP.

Dare I say, there should be little customer impact as of 13:37 PST.....

The status dashboard is going to be your best source of information.

[+] svacko|6 years ago|reply
Is the other cloud provider AWS? I could see tons of connection timeouts between GCP and the S3/Elasticsearch services.

Hope everything is resolved now for good.

[+] gigatexal|6 years ago|reply
Oh man I had no idea the big cloud providers have dependencies on other clouds like this.
[+] unbeli|6 years ago|reply
[removed]
[+] nammi|6 years ago|reply
We were seeing timeouts in east-1. I don't know what "normal" looks like, but Pingdom's map seems to show the whole east coast as affected https://livemap.pingdom.com/
[+] svacko|6 years ago|reply
Yeah, our GKE pods running in us-east1 were dying like crazy ~90 minutes ago... Hope they are going to resolve this soon. Not the luckiest day for Google, nor for us.
[+] x__x|6 years ago|reply
I was bummed out when Siteground moved all their cloud accounts over to G without telling their customers beforehand.
[+] kgraves|6 years ago|reply
This is extremely concerning as somebody looking to move or build on top of GCP for the long term. I wonder why anyone would choose GCP if outages are occurring on a regular basis.
[+] pgodzin|6 years ago|reply
Any evidence they happen more frequently than on the other clouds?
[+] tagux|6 years ago|reply
"We had a router failure in Atlanta".

WHAT? You kidding us?

Urs Hölzle, senior vice president of technical infrastructure at Google Cloud, said, "We're very sorry about that! We had a router failure in Atlanta, which affected traffic routed through that region. Things should be back to normal now. Just to make sure: This wasn't related to traffic levels or any kind of overload, our network is not stressed by COVID-19."

[+] ocdtrekkie|6 years ago|reply
Was it like... a hardware failure? If you serve more than 100 people you probably should have redundant routers. Was it a configuration issue that replicated over to multiple devices at least, I hope?