For those out of the loop, this week has seen outages at a bunch of major US internet companies: Cloudflare, Slack, Google Cloud, Azure, Facebook (including Instagram and WhatsApp), and now iCloud.
AMD has been shipping their new Epyc Rome server CPUs to hyperscalers for a few weeks now [1]. It seems to be a very appealing chip that a lot of them are adopting. The new chips won't be the same as whatever they were using before, so it could be that major updates are needed to support them, meaning that it's just a risky time right now?
Nope, just a typical network hardware failure in a west coast datacenter, which triggered a cascading BGP failure disrupting much of the network connectivity in that DC.
You know the old adage that celebrities die in threes? It's actually mathematically supported... or, well, it's supported that they die in 2.718s. Same principle would apply to cloud service outages if all the services and their failures were actually independent. We'd expect them to happen in "clusters" of e:
Actually there are not many actual conspiracies listed there.
I will start:
- All the companies in the list are getting rid of their Huawei dependencies, replacing routers with domestic ones.
- China is flexing its muscles in the trade war: look, we are creating some instability in your e-commerce. Wait until we attack your infra (electricity, water).
Maybe some submarines are cutting some deep sea cables? In actuality it's probably massive backbone upgrades that are classified; the NSA has to obtain real-time data from somewhere.
Just wondering out loud rather than floating any theories: is it in any way related to the 6.4 magnitude earthquake and/or the geological events leading up to it? Also, related or not, Google provides Apple with the infrastructure for iCloud, and they also experienced some downtime in the last few days, along with others, i.e. tremors causing some sensitive/critical servers to misbehave.
However, I doubt that it is down to these factors, as there will likely be a significant amount of distributed fault tolerance, failover and contingency plans in place.
Apple runs their own bare-metal services - only some of which are hosted on Google Cloud. You could see which ones went down when Google did the other week.
I'd like to think the simplest explanation to this recent spate of outages is that all the engineers at each org spent too long reading the HN comment thread on the previous company's outage, and didn't notice their own servers catching fire ;)
A little off-topic, but lately I often find myself AirDropping files between my iPad and MacBook because iCloud isn't syncing newly added files for some reason (and a sync can't be forced or refreshed on the iPad, as far as I know).
Are there any graphs anywhere showing Internet traffic at different ISPs and networks? Such a graph (especially over a world map) would make it obvious if there was any DDoS, SSH bruteforcing, or other monkey business going on.
I wish there was a historical view. Very curious if this is common, or if it's unusual (which would make it especially unusual with the Slack, Facebook, etc. outages recently).
I'm actually relieved to hear there were a bunch of known problems…earlier today I was wondering why none of my notes and photos were syncing and thought iCloud was losing its marbles. It used to happen a lot to me, but in the last year or so it's been pretty stable. I was afraid sync reliability was regressing!
I suspect Apple's making upgrades as part of the push towards iOS 13 and macOS Catalina, and ran into some rollout glitches.
One of the services listed is Screen Time. Maybe I'm mistaken, but isn't that the feature on iPhones where you can see how much time you spend in each app? Why would that require Apple's servers?
In India, I've been experiencing issues with online payments for the last few days, mostly with UPI payments.
Not sure if these are related, but it's hard to ignore.
[+] [-] Meekro|6 years ago|reply
[+] [-] mcqueenjordan|6 years ago|reply
[+] [-] walrus01|6 years ago|reply
[+] [-] tootie|6 years ago|reply
[+] [-] flylib|6 years ago|reply
[+] [-] OrgNet|6 years ago|reply
[+] [-] saltedshiv|6 years ago|reply
[+] [-] basch|6 years ago|reply
[+] [-] techrich|6 years ago|reply
[+] [-] aristophenes|6 years ago|reply
[1] Second question in https://www.anandtech.com/show/14568/an-interview-with-amds-...
[+] [-] pilif|6 years ago|reply
And an hour later, I'm reading that most of iCloud is down. I really hope this is a funny coincidence. And if not, then I'm terribly sorry.
[+] [-] therein|6 years ago|reply
All partitions are more or less separated and shouldn't affect one another. For that reason, it probably wasn't you that broke the whole system.
Even the Cassandra they use for metadata storage is partitioned and separated.
[+] [-] postmortimus|6 years ago|reply
[+] [-] jokoon|6 years ago|reply
[+] [-] mandeepj|6 years ago|reply
[+] [-] mr_sturd|6 years ago|reply
[+] [-] OrgNet|6 years ago|reply
[+] [-] your_bully|6 years ago|reply
[+] [-] envolt|6 years ago|reply
https://news.ycombinator.com/item?id=20345060 (Facebook, Instagram, and WhatsApp outages)
[+] [-] 3JPLW|6 years ago|reply
http://ssp.impulsetrain.com/celebrities.html
I still love me a good conspiracy theory, but clustering of random (Poisson) events is much more likely than you'd expect.
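For anyone who wants to poke at the "clusters of e" claim, here is a rough simulation sketch. It rests on an assumption of mine that the thread doesn't spell out: two consecutive events count as part of the same cluster when the gap between them is shorter than the average gap. The function name `mean_cluster_size` and its parameters are made up for illustration.

```python
import math
import random


def mean_cluster_size(n_events: int, rate: float = 1.0, seed: int = 0) -> float:
    """Simulate independent (Poisson) events and return the average cluster size.

    Assumption (not from the thread): consecutive events share a cluster
    when the gap between them is shorter than the mean gap, 1 / rate.
    """
    rng = random.Random(seed)
    mean_gap = 1.0 / rate
    clusters = 1  # the first event opens the first cluster
    for _ in range(n_events - 1):
        gap = rng.expovariate(rate)  # exponential inter-arrival time
        if gap >= mean_gap:
            clusters += 1  # a long gap ends one cluster and starts the next
    return n_events / clusters


if __name__ == "__main__":
    print(mean_cluster_size(200_000), "vs e =", math.e)
```

Under that definition a gap exceeds the mean with probability 1/e, so cluster sizes are geometric with mean e, and the simulated average should land close to 2.718 for a large number of events. A different cluster definition would give a different number.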
[+] [-] the-dude|6 years ago|reply
[+] [-] cantbecool|6 years ago|reply
[+] [-] ksec|6 years ago|reply
I don't even know if the fourth or fifth is confirmation.
[+] [-] the-dude|6 years ago|reply
[+] [-] johnnycab|6 years ago|reply
https://earthquake.usgs.gov/earthquakes/eventpage/ci38443183...
[+] [-] Raphmedia|6 years ago|reply
There were some reports of earthquake lights[1][2][3] and earthquake clouds[4][5][6] recently.
[1] https://en.wikipedia.org/wiki/Earthquake_light
[2] https://www.mirror.co.uk/news/weird-news/men-film-glowing-sn...
[3] https://www.dailystar.co.uk/news/weird-news/789760/Colorado-...
[4] https://en.wikipedia.org/wiki/Earthquake_cloud
[5] https://weather.com/vertical/video/strange-clouds-seen-in-co...
[6] https://www.denverpost.com/2019/06/20/boulder-rare-helix-clo...
[+] [-] luckydata|6 years ago|reply
[+] [-] jondwillis|6 years ago|reply
What earthquake related causes could have had effects 24h or more ahead of time?
[+] [-] cameronbrown|6 years ago|reply
[+] [-] pcora|6 years ago|reply
[+] [-] Buge|6 years ago|reply
[+] [-] mehrshad|6 years ago|reply
[+] [-] samcday|6 years ago|reply
[+] [-] ryanmarsh|6 years ago|reply
[+] [-] SpaceManNabs|6 years ago|reply
[+] [-] ohnope|6 years ago|reply
[+] [-] kemals|6 years ago|reply
[+] [-] DavideNL|6 years ago|reply
[+] [-] snazz|6 years ago|reply
[+] [-] ceejayoz|6 years ago|reply
[+] [-] jaredcwhite|6 years ago|reply
[+] [-] thekyle|6 years ago|reply
[+] [-] 3xblah|6 years ago|reply
[+] [-] kburman|6 years ago|reply
[+] [-] fc_barnes|6 years ago|reply
[+] [-] jwr|6 years ago|reply
[+] [-] mlosapio|6 years ago|reply
[+] [-] rahuldottech|6 years ago|reply