Had a meeting where developers were discussing the infrastructure for an application. A crucial part of the whole flow was completely dependant on an AWS service. I asked if it was a single point of failure. The whole room laughed, I rest my case.
Similar experience here. People laughed and some said something like "well, if something like AWS falls then we have bigger problems". They laugh because honestly is too far-fetched to think the whole AWS infra going down. Too big to fail as they say in the US. Nothing short of a nuclear war would fuck up the entire AWS network so they're kinda right.
Until this happen. A single region in a cascade failure and your saas is single region.
They’re not wrong though. If AWS goes down, EVERYTHING goes down to some degree. Your app, your competitor’s apps, your clients’ chat apps. You’re kinda off the hook.
If you were dependent upon a single distribution (region) of that Service, yes it would be a massive single point of failure in this case. If you weren't dependent upon a particular region, you'd be fine.
Of course lots of AWS services have hidden dependencies on us-east-1. During a previous outage we needed to update a Route53(DNS) record in us-west-2, but couldn't because of the outage in us-east-1.
Relying on AWS is a single point of failure. Not as much as relying on a single AWS region, but it's still a single point.
It's fairly difficult to avoid single points of failure completely, and if you do it's likely your suppliers and customers haven't managed to.
It's about how much your risk level is.
AWS us-east-1 fails constantly, it has terrible uptime, and you should expect it to go. A cyberattack which destroyed AWSs entire infrastructure would be less likely. BGP hijacks across multiple AWS nodes are quite plausible though, but that can be mitigated to an extent with direct connects.
Sadly it seems people in charge of critical infrastructure don't even bother thinking about these things, because next quarters numbers are more important.
I can avoid London as a single point of failure, but the loss of Docklands would cause so much damage to the UK's infrastructure I can't confidently predict that my servers in Manchester connected to peering points such as IXman will be able to reach my customer in Norwich. I'm not even sure how much international connectivity I could rely on. In theory Starlink will continue to work, but in practice I'm not confident.
When we had power issues in Washington DC a couple of months ago, three of our four independent ISPS failed, as they all had undeclared requirements on active equipment in the area. That wasn't even a major outage, just a local substation failure. The one circuit which survived was clearly just fibre from our (UPS/generator backed) equipment room to a data centre towards Baltimore (not Ashburn).
aeve890|4 months ago
Until this happen. A single region in a cascade failure and your saas is single region.
stephenlf|4 months ago
lxgr|4 months ago
Elidrake24|4 months ago
zimbu668|4 months ago
ta1243|4 months ago
It's fairly difficult to avoid single points of failure completely, and if you do it's likely your suppliers and customers haven't managed to.
It's about how much your risk level is.
AWS us-east-1 fails constantly, it has terrible uptime, and you should expect it to go. A cyberattack which destroyed AWSs entire infrastructure would be less likely. BGP hijacks across multiple AWS nodes are quite plausible though, but that can be mitigated to an extent with direct connects.
Sadly it seems people in charge of critical infrastructure don't even bother thinking about these things, because next quarters numbers are more important.
I can avoid London as a single point of failure, but the loss of Docklands would cause so much damage to the UK's infrastructure I can't confidently predict that my servers in Manchester connected to peering points such as IXman will be able to reach my customer in Norwich. I'm not even sure how much international connectivity I could rely on. In theory Starlink will continue to work, but in practice I'm not confident.
When we had power issues in Washington DC a couple of months ago, three of our four independent ISPS failed, as they all had undeclared requirements on active equipment in the area. That wasn't even a major outage, just a local substation failure. The one circuit which survived was clearly just fibre from our (UPS/generator backed) equipment room to a data centre towards Baltimore (not Ashburn).
wubrr|4 months ago
antinomicus|4 months ago
dvsgaevsvsgavsv|4 months ago
steveBK123|4 months ago