top | item 20077421

Google Cloud Is Down

1395 points| markoa | 6 years ago

https://status.cloud.google.com

https://status.cloud.google.com/incident/compute/19003

Status page reports all green, however the outage is affecting YouTube, Snapchat, and thousands of other users.

593 comments

[+] boulos|6 years ago|reply
Disclosure: I work on Google Cloud (but disclaimer, I'm on vacation and so not much use to you!).

We're having what appears to be a serious networking outage. It's disrupting everything, including unfortunately the tooling we usually use to communicate across the company about outages.

There are backup plans, of course, but I wanted to at least come here to say: you're not crazy, nothing is lost (to those concerns downthread), but there is serious packet loss at the least. You'll have to wait for someone actually involved in the incident to say more.

[+] boulos|6 years ago|reply
To clarify something: this outage doesn’t appear to be global, but it is hitting us particularly hard in parts of the US. So for the folks with working VMs in Mumbai, you’re not crazy. But for everyone with sadness in us-central1, the team is on it.
[+] odiroot|6 years ago|reply
> including unfortunately the tooling we usually use to communicate across the company about outages.

There's some irony in that.

[+] ohazi|6 years ago|reply
> including unfortunately the tooling we usually use to communicate across the company about outages.

So memegen is down?

[+] ChuckMcM|6 years ago|reply
I'm guessing this will be part of the next DiRT exercise :-) (DiRT being the disaster recovery exercises that Google runs internally to prepare for this sort of thing)
[+] SmokeGS|6 years ago|reply
>nothing is lost

except time

[+] foobarbazetc|6 years ago|reply
Seems to be the private network. The public network looks fine to us from all over the world?
[+] Yrlec|6 years ago|reply
Now is a good time to point out that the SLA of Google Cloud Storage only covers HTTP 500 errors: https://cloud.google.com/storage/sla. So if the servers are not responding at all then it's not covered by the SLA. I've brought this to their attention and they basically responded that their network is never down.
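
To make that gap concrete, here is a minimal sketch of how the measured error rate changes depending on whether requests that got no response are counted. It assumes, as the comment above says, that only HTTP 5xx responses count toward the SLA; the function name `error_rates` and the convention of using `None` for "no response" are made up for illustration and are not anything from Google's tooling.

```python
from collections import Counter

def error_rates(outcomes):
    """Compare two ways of computing an error rate.

    `outcomes` is an iterable of HTTP status codes (int), with None
    standing in for a request that never received a response
    (timeout, connection refused, and so on). Hypothetical convention.
    """
    counts = Counter()
    total = 0
    for status in outcomes:
        total += 1
        if status is None:
            counts["no_response"] += 1
        elif 500 <= status <= 599:
            counts["http_5xx"] += 1
    if total == 0:
        return {"http_5xx_only": 0.0, "including_no_response": 0.0}
    return {
        # What counts if only 5xx responses are treated as errors.
        "http_5xx_only": counts["http_5xx"] / total,
        # What the client actually experienced.
        "including_no_response": (counts["http_5xx"] + counts["no_response"]) / total,
    }
```

With a sample where most failures are timeouts, the first figure stays low while the second reflects what the client actually saw.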
[+] based2|6 years ago|reply
[+] _Marak_|6 years ago|reply
This should be voted higher up.

According to https://twitter.com/bgp4_table, we have just exceeded 768k Border Gateway Protocol routing entries, which may be causing some routers to malfunction.

[+] juanuys|6 years ago|reply
Will this affect more than just Google? I haven't seen any outages from other cloud providers.
[+] tntn|6 years ago|reply
There goes 3 nines for June and for Q2. I guess everyone gets a 10% discount for the month? https://cloud.google.com/compute/sla
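
For reference, the downtime budget behind "three nines" is simple arithmetic. A minimal sketch, assuming a 30-day June and a 91-day Q2; the function name `downtime_budget` is made up for illustration, and which availability level triggers which credit is defined by the linked SLA page, not by this code.

```python
from datetime import timedelta

def downtime_budget(availability_pct: float, days: int) -> timedelta:
    """Maximum downtime in a period of `days` days at the given availability."""
    return timedelta(days=days) * (1 - availability_pct / 100)

# Illustrative availability levels only; the real credit tiers live on the SLA page.
for pct in (99.99, 99.9, 99.0):
    print(f"{pct}% over 30 days: {downtime_budget(pct, 30)}")
print(f"99.9% over 91 days: {downtime_budget(99.9, 91)}")
```

At 99.9%, a 30-day month allows about 43 minutes of downtime and a 91-day quarter about 131 minutes, so a single outage longer than that uses up both.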
[+] OkGoDoIt|6 years ago|reply
Remember to request the credit!

From that linked page:

"Customer Must Request Financial Credit

In order to receive any of the Financial Credits described above, Customer must notify Google technical support within thirty days from the time Customer becomes eligible to receive a Financial Credit. Customer must also provide Google with server log files showing loss of external connectivity errors and the date and time those errors occurred. If Customer does not comply with these requirements, Customer will forfeit its right to receive a Financial Credit. If a dispute arises with respect to this SLA, Google will make a determination in good faith based on its system logs, monitoring reports, configuration records, and other available information, which Google will make available for auditing by Customer at Customer’s request."
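
Since the quoted terms ask for logs showing connectivity errors with dates and times, here is a rough sketch of a probe that produces one timestamped line per check. Everything about it is an assumption for illustration: the function name `probe`, the log line format, and the idea that a plain TCP connect is a good enough signal. The SLA text does not say what log format Google accepts.

```python
import socket
from datetime import datetime, timezone

def probe(host, port, timeout=3.0, connect=socket.create_connection):
    """Try one TCP connection and return a timestamped log line.

    `connect` is injectable so the function can be exercised without
    a network; by default it is socket.create_connection.
    """
    ts = datetime.now(timezone.utc).isoformat(timespec="seconds")
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError as exc:  # covers timeouts, refusals, unreachable hosts
        return f"{ts} {host}:{port} FAIL {type(exc).__name__}: {exc}"
    conn.close()
    return f"{ts} {host}:{port} OK"
```

Run from cron or a loop and appended to a file, this gives a record of when connectivity was lost and when it came back.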

[+] gundmc|6 years ago|reply
A couple more hours and everyone will get 25% off for June.
[+] londons_explore|6 years ago|reply
The discount seems way too small.

I would pay a premium for a cloud provider happy to give 100 percent discount for the month for 10 minutes downtime, and 100 percent discount for the year for an hour's downtime.

[+] w_s_l|6 years ago|reply
You know this reminds me of a bad taste that Google Sales team left when I asked for some of my billing that I was unaware of running after following a quickstart guide.

AWS refunded me in the first reply on the same day!

GCP sales rep just copy pasted a link to a self support survey that essentially told me, after a series of YES or NO questions that they can't refund me.

So why not just tell your customers like it is? Google Cloud is super strict when it comes to billing. I have called my bank to do a chargeback and put a hold on all future billing with GCP.

I'm now back on AWS and still on the Free Tier. Apparently the $300 trial with Google Cloud did not include some critical products. The AWS Free Tier makes it super clear, and even so I sometimes leave something running and discover it on my invoice.

I've yet to receive a reply from Google and it's been a week now.

I do appreciate other products such as Firebase but honestly for infrastructure and for future integration with enterprise customers I feel AWS is more appropriate and mature.

[+] mcintyre1994|6 years ago|reply
The thing that worries me most about Google Cloud and these billing stories is that I’m assuming if you chargeback or block them at your bank then they’ll ban all Google accounts of yours - and they’re obviously going to be able to make the link between an account made just for Google Cloud and my real account.
[+] lucb1e|6 years ago|reply
Are you seriously complaining about having to pay for using their resources? I understand that you're surprised some things aren't covered in the free trial or free credit or whatever, but getting $300 free already sounded a little too good to be true (I heard about it from a friend and was dubious; at least in Europe, consumers are told not to enter deals that are too good to be true), you could at least have checked what you're actually getting.

I think it's weird to say you get credit in dollars and then not be able to spend it on everything. That's not how money works. But that's the way hosting providers work and afaik it's quite well known. Especially with a large sum of "free money", even if it's not well known, it was on you to check any small print.

[+] kerng|6 years ago|reply
Google is well known for not caring about small shops; only if you are a multi-million-dollar customer with a dedicated account manager can you expect reasonable support. That's been the case forever with them.
[+] bscphil|6 years ago|reply
>I asked for some of my billing that I was unaware of running

>I have called my bank to do a chargeback

You're issuing a chargeback because you made a mistake and spent someone else's resources? And you're admitting to this on HN? I'm not a lawyer, but that sounds like fraud and / or theft to me.

[+] espeed|6 years ago|reply
What was the quickstart guide?
[+] WC3w6pXxgGd|6 years ago|reply
Anything created in-house at Google (GCP) is typically created by technically proficient devs; those devs then leave the project to start something new, and maintenance is left to interns and new hires. Google customer service basically doesn't care and also has no tools at their disposal to fix any issues anyway.

The infinite money spout that is Google Ads has created a situation in which devs are at Google just to have fun - there really is no incentive to maintain anything because the money will flow regardless of quality.

Source: I interned at Google.

[+] ksajadi|6 years ago|reply
The GCP status page is worthless: it's always happy and green when production systems are down, and then they might acknowledge something an hour later.
[+] colinbartlett|6 years ago|reply
Google Cloud is the number 4 most monitored status page on StatusGator and Google Apps is number 12. In addition, at least 20 other services we monitor seemingly depend on Google Cloud because they all posted issues as soon as Google went down.

It's always interesting to see these outages at large cloud providers spider out across the rest of the internet; a lot of the world depends on Google to stay up.

[+] nabla9|6 years ago|reply
This feels like the '80s.

When the mainframe is down, the terminals are useless.

[+] hhs|6 years ago|reply
"a lot of the world depends on Google to stay up."

Yup, I'm trying to check the Associated Press News right now and it's having trouble connecting to "storage.googleapis.com".

[+] FPGAhacker|6 years ago|reply
I guess we know what Steam uses (the store at least).
[+] hazeii|6 years ago|reply
...and only the paranoid survive?
[+] macintux|6 years ago|reply
And thus was ruined hundreds or thousands of pleasant Sunday afternoons.

I don’t miss being on pager duty one bit. I see it looming in my headlights, sadly.

[+] xerxes901|6 years ago|reply
Spare a thought for the pleasant Australian early Monday mornings too! Always a rude awakening...
[+] newsbinator|6 years ago|reply
It's the Queen's birthday, a Monday off here in New Zealand...

... but not for everybody now.

[+] jacques_chester|6 years ago|reply
The only response is to wait for Google to fix it.

Nothing you or I or the pager can do will speed that up.

I am aware some bosses won't believe that and I am not trying to make light of it. But there really isn't much else to do except wait.

[+] jagtesh|6 years ago|reply
Multi-cloud for those times when you really need that level of availability and can afford it.
[+] _xerxes_|6 years ago|reply
Nest is down too, not surprising given they are part of Google. What I don't understand is why I can't still control my devices over my local network. Why does the system even require access to Google servers?
[+] titzer|6 years ago|reply
This is your yearly reminder to resist centralization of the internet.
[+] londons_explore|6 years ago|reply
It seems the AdWords anti-spam system is down, which means anyone can put a billion dollar bid on every keyword and get their ads showing on every Google search for every query.

Systems that fail 'open'...

[+] sarim|6 years ago|reply
Funny how as soon as I realized that Gmail and Google Sheets aren’t working properly I rushed to HN to figure out what’s going on. I love this community!
[+] squarefoot|6 years ago|reply
And Gmail isn't feeling very well today either.

  [21:55:19] POP< +OK send PASS
  [21:55:19] POP> PASS ********
  [21:55:21] POP< +OK Welcome.
  [21:55:21] POP> STAT
  [21:55:21] POP< -ERR [SYS/TEMP] Temporary system problem. Please try again later.
[+] klon|6 years ago|reply
Anyone using both AWS and GCP that can form an opinion on availability of both? As a GCP customer I am not very happy with theirs.
[+] different_sort|6 years ago|reply
I was playing around this afternoon with App Engine, and thought I broke one of my projects when I started getting 502s back.

There appear to be some irregularities on consumer services as well that are almost certainly related; YouTube was behaving a bit oddly for me.

The impact seems to be cascading down from just GCE to other services as well, and the status page certainly does not reflect the reality of the situation. You can't even sign into GCP right now, and things that run on GCE, like App Engine, seem impacted.