Google Cloud outage

[+] qmarchi|6 years ago|reply

Heyo Googler here.

The problem was a mix between another cloud provider and GCP.

Dare I say, there should be little customer impact as of 13:37 PST.....

The status dashboard is going to be your best source of information.

[+] svacko|6 years ago|reply

Is the other cloud provider AWS? I could see tons of connection timeouts between GCP and the S3/Elasticsearch services.

Hope everything is resolved now for good.

[+] unknown|6 years ago|reply

[deleted]

[+] gigatexal|6 years ago|reply

Oh man I had no idea the big cloud providers have dependencies on other clouds like this.

[+] the-dude|6 years ago|reply

This can't be real.

[+] unbeli|6 years ago|reply

[removed]

[+] nammi|6 years ago|reply

We were seeing timeouts in east-1. I don't know what "normal" looks like, but Pingdom's map seems to show the whole east coast as affected https://livemap.pingdom.com/

[+] svacko|6 years ago|reply

yeah, our GKE pods running in us-east1 were dying ~90 minutes ago like crazy... hope they are gonna resolve this soon. not the luckiest day for Google, nor for us

[+] x__x|6 years ago|reply

I was bummed out when Siteground moved all their cloud accounts over to G without telling their customers beforehand

[+] optimal_alex|6 years ago|reply

[deleted]

[+] kgraves|6 years ago|reply

This is extremely concerning as somebody looking to move or build on top of GCP for the long term. I wonder why anyone would choose GCP if outages are occurring on a regular basis.

[+] pgodzin|6 years ago|reply

Any evidence they happen more frequently than on the other clouds?

[+] tagux|6 years ago|reply

"We had a router failure in Atlanta".

WHAT? You kidding us?

Urs Hölzle, senior vice president of technical infrastructure at Google Cloud, said, "We're very sorry about that! We had a router failure in Atlanta, which affected traffic routed through that region. Things should be back to normal now. Just to make sure: This wasn't related to traffic levels or any kind of overload, our network is not stressed by COVID-19."

[+] thedance|6 years ago|reply

Wrong outage.

[+] neonate|6 years ago|reply

https://twitter.com/uhoelzle/status/1243217659690278912

[+] ocdtrekkie|6 years ago|reply

Was it like... a hardware failure? If you serve more than 100 people you probably should have redundant routers. Was it a configuration issue that replicated over to multiple devices at least, I hope?

33 comments