The Cloudflare outage might be a good thing

276 points| radeeyate | 3 months ago |gist.github.com

202 comments

krick|3 months ago

It would be a good thing, if it would cause anything to change. It obviously won't. As if a single person reading this post wasn't aware that the Internet is centralized, and couldn't name specifically a few sources of centralization (Cloudflare, AWS, Gmail, Github). As if it's the first time this happens. As if after the last time AWS failed (or the one before that, or one before…) anybody stopped using AWS. As if anybody could viably stop using them.

testdelacc1|3 months ago

If anything, centralisation shields companies using a hyperscaler from criticism. You’ll see downtime no matter where you host. If you self host and go down for a few hours, customers blame you. If you host on AWS and “the internet goes down”, then customers treat it akin to an act of God, like a natural disaster that affects everyone.

It’s not great being down for hours, but that will happen regardless. Most companies prefer the option that helps them avoid the ire of their customers.

Where it’s a bigger problem is when a critical industry like retail banking in a country all choose AWS. When AWS goes down all citizens lose access to their money. They can’t pay for groceries or transport. They’re stranded and starving, life grinds to a halt. But even then, this is not the bank’s problem because they’re not doing worse than their competitors. It’s something for the banking regulator and government to worry about. I’m not saying the bank shouldn’t worry about it, I’m saying in practice they don’t worry about it unless the regulator makes them worry.

I completely empathise with people frustrated with this status quo. It’s not great that we’ve normalised a few large outages a year. But for most companies, this is the rational thing to do. And barring a few critical industries like banking, it’s also rational for governments to not intervene.

ectospheno|3 months ago

I’m pretty cloudflare centric. I didn’t start that way. I had services spread out for redundancy. It was a huge pain. Then bots got even more aggressive than usual. I asked why I kept doing this to myself and finally decided my time was worth recapturing.

Did everything become inaccessible the last outage? Yep. Weighed against the time it saves me throughout the year I call it a wash. No plans to move.

captainkrtek|3 months ago

> It would be a good thing, if it would cause anything to change. It obviously won't.

I agree wholeheartedly. The only change is internal to these organizations (e.g. Cloudflare, AWS). Improvements will be made to the relevant systems, and some teams internally will also audit for similar behavior, add tests, and fix some bugs.

However, nothing external will change. The cycle of pretending like you are going to implement multi-region fades after a week. And each company goes on continuing to leverage all these services to the Nth degree, waiting for the next outage.

Not advocating that organizations should/could do much, it's all pros/cons. But the collective blast radius is still impressive.

stingraycharles|3 months ago

It’s just a function of costs vs benefits. For most people, building redundancy at this layer costs far more than the benefits are worth.

If Cloudflare or AWS go down, the outage is usually so big that smaller players have an excuse and people accept that.

It’s as simple as that.

“Why isn’t your site working?” “Half the internet is down, here read this news article: …” “Oh, okay, let me know when it’s back!”

GuB-42|3 months ago

Same idea with the Crowdstrike bug: it seems like it didn't have much of an effect on their customers, certainly not with my company at least, and the stock quickly recovered, in fact doing very well. For me, it looks like nothing changed, no lessons learned.

ehhthing|3 months ago

With the rise in unfriendly bots on the internet as well as DDoS botnets reaching 15 Tbps, I don’t think many people have much of a choice.

tete|3 months ago

> As if anybody could viably stop using them.

To be fair AWS (and GCP and Azure) at least is easy to replace with something else. And pretty much all alternatives are cheaper, less messy, etc. There are very few situations where you cannot viably do so.

We live in a world where you can get things like dedicated servers, etc. within similar time spans as creating a "compute engine" node on a big cloud provider.

The fact that cloud services added serious limitations to what applications were able to do (things like state management, passing configuration in more unified ways, etc.) means that running your own infrastructure is easier than ever, since your devs won't end up whining at you until you do something super custom just for some project to be a bit easier. But if you really want to you can.

GitHub also has become easy to get away from and indeed many individuals and companies did so.

CDNs are the bigger thing but A) there are a lot of other CDNs and B) having an image, or let's say an Ansible config, allows you to quickly deploy something that might be close enough for your use case. Just take any hosting company or even a dozen around the world.

Of course if you allowed yourself to end up in a complete vendor lock in things might be different, but if you think that it's a good idea to be completely dependent on the whims of some other company maybe you deserve that state. As in don't run a business without having any kind of fallback for decisions you make. Yes, profit from that big benefit something might give you, but don't lock the door behind you.

Sure you might be lucky and sure maybe you are fine going for luck while it lasts. Just don't be surprised when it all shatters.

philipallstar|3 months ago

> As if anybody could viably stop using them.

It is as easy to not use them as it ever was. There has been no actual centralisation. Everything is done using open protocols. I don't know what more you could want.

Compare it to Windows where there is deep volume discounting and salespeople shmoozing CTOs and getting in with schools, healthcare providers etc etc. That's actual lock-in.

markus_zhang|3 months ago

It’s too few and far between. It’s gonna make some changes if it’s a monthly event. If businesses start to lose connection for 8 hours every month, maybe the bigger ones are going to run for self hosting or at least some capacity of self hosting.

fragmede|3 months ago

> It obviously won't.

Here's where we separate the men from the boys, the women from the girls, the Enbys from the enbetts, and the SREs from the DevOps. If you went down when Cloudflare went down, do you go multicloud so that can't happen again, or do you shrug your shoulders and say "well, everyone else is down"? Have some pride in your work, do better, be better, and strive for greatness. Have backup plans for your backup plans, and get out of the pit of mediocrity.

Or not, shit's expensive and kubernetes is too complicated and "no one" needs that.

tcfhgj|3 months ago

> As if anybody could viably stop using them.

You can, and even save money.

sjamaan|3 months ago

Same with the big Crowdstrike fail of 2024. Especially when everyone kept repeating the laughable statement that these guys have their shit in order, so it couldn't possibly be a simple fuckup on their end. Guess what, they don't, and it was. And nobody has realized the importance of diversity for resilience, so all the major stuff is still running on Windows and using Crowdstrike.

notepad0x90|3 months ago

Does the author of this post not see the irony of posting this content on Github?

My counter argument is that "centralization" in a technical sense isn't about what company owns things but how services are operated. Cloudflare is very decentralized.

Furthermore, I've seen regional outages caused by things like anchors dropped by ships in the wrong place or a shark eating a cable, and regional power outages caused by squirrels, etc. Outages happen.

If everyone ran their own server from their own home, AT&T or Level3 could have an outage and still take out similar swathes of the internet.

With CDNs like cloudflare, if Level3 had an outage, your website won't be down because your home or VPS host's upstream transit happens to be Level3 (or whatever they call themselves these days) because your content (at least static) is cached globally.

The only real reasonable alternative is something like ipfs, web3 and similar talk.

Cloudflare has always called itself a content transport provider, think of it as such. But also, Cloudflare is just one player, there are several very big players. Every big cloud provider has a competing product, not to mention companies like Akamai.

People are rage posting about cloudflare, especially because it has made CDNs accessible to everyone. You can easily setup a free cloudflare account and be on your merry way. This isn't something you should be angry about. You're free to pay for any number of other cdns, many do.

If you don't like how Cloudflare has so much market share, then come up with a similarly competitive alternative and profit. Just this HN thread alone is enough for me to think there is a market for more players. Or, just spread the word about the competition that exists today. Use frontdoor, cloudfront, netlify, flycdn, akamai,etc... It's hardly a monopoly.

jbreckmckye|3 months ago

As explained in other comments, I only used Gist because publishing to my own site failed. So GH is my redundancy :-)

miki123211|3 months ago

I don't know how many times I need to say this, but I will die on this hill.

Centralized services don't decrease redundancy. They're usually far more redundant than whatever homegrown solution you can come up with.

The difference between centralized and homegrown is mostly psychological. We notice the outages of centralized systems more often, as they affect everything at the same time instead of different systems at different times. This is true even if, in a hypothetical world with no centralization, we'd have more total outage time than we do now.

If your gas station says "closed" due to a problem that only affects their own networks, people usually go "aah they're probably doing repairs or something", and forget about the problem 5 minutes later. If there's a Cloudflare outage... everybody (rightly) blames the Cloudflare outage.

Where this becomes a problem is when correlated failures are actually worse than uncorrelated ones. If Visa goes down, it's better if Mastercard stays up, because many customers have both and can use the other when one doesn't work. In some ways, it's better to have 30 mins of Visa outages today and 30 mins of Mastercard outages tomorrow, than to have just 15 mins of correlated outages in one day.
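
The Visa/Mastercard arithmetic can be made concrete with a minimal sketch. It assumes a customer who holds both cards and can pay as long as at least one network is up; the outage start times and the `unavailable_minutes` helper are hypothetical, chosen only to match the 30/30/15-minute figures above.

```python
def unavailable_minutes(outages_by_network):
    """Count minutes during which *every* network is down at once.

    Each outage is a half-open (start_minute, end_minute) interval
    on a shared timeline.
    """
    down_sets = [
        {m for start, end in outages for m in range(start, end)}
        for outages in outages_by_network.values()
    ]
    return len(set.intersection(*down_sets))


DAY = 24 * 60  # minutes in a day

# Uncorrelated: 30 min of Visa today, 30 min of Mastercard tomorrow.
uncorrelated = {
    "visa": [(600, 630)],
    "mastercard": [(DAY + 600, DAY + 630)],
}

# Correlated: both networks down for the same 15 minutes.
correlated = {
    "visa": [(600, 615)],
    "mastercard": [(600, 615)],
}

print(unavailable_minutes(uncorrelated))  # 0
print(unavailable_minutes(correlated))    # 15
```

Under that assumption, 60 total minutes of uncorrelated outage leave the customer with zero minutes where they cannot pay, while 15 correlated minutes leave them with 15.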

lloeki|3 months ago

"redundancy" might not be the correct word. If we had a single worldwide mega-entity serving 100% of the internet it would be both a monopoly and would have tons of redundant infrastructure.

But it would also be quite unified; the system, while full of redundancies, as a whole is a unique one operated the same way end to end; by virtue of it being a single system handled in a uniform way, a single glitch could bring it all down. There is no diversity in the system's implementation, the monoculture itself makes it vulnerable.

freeplay|3 months ago

The problem is creating a single point of failure.

There's no doubt a VM in AWS is exponentially more redundant than my VM running on a couple of Intel NUCs in my closet.

The difference is, when I have a major outage, my blog goes down.

When EC2 has a major outage, all of the blogs go down. Along with Wikipedia, Starbucks, and half the internet.

That single point of failure is the issue.

dgan|3 months ago

> Centralized services don't decrease redundancy

Alright, but it creates a failure correlation where previously there was none

masfuerte|3 months ago

In my experience services aren't failing due to a lack of redundancy but due to an excess of complexity. With the move to the cloud we are continually increasing both redundancy and complexity and this is making the problem worse.

I have a cheap VPS that has run reliably for a decade except for a planned hour of downtime. Which was in the middle of the night when no-one cared. Amazon is more reliable in theory. My cheap VPS is more reliable in practice.

tjwebbnorfolk|3 months ago

Every HN comment seems to say the same thing: downtime is inexcusable and the centralization of these services is ruining the internet.

I still don't see the big deal. 12 hours of downtime once every couple years isn't the end of the world. So people can't log into their bank website for a few hours -- banks used to only be open for like 4 hours a day and somehow we all survived. Twitter is down? Oh what a tragedy. Customers get some refunds, Cloudflare fixes the issue, and people move on with life.

Cars still break down occasionally after 100+ years of engineering for reliability and safety. The power still goes out every now and then. Cook on the stove. The cost of making everything perfect all the time just isn't worth it.

I run my own servers on my own network and do not use Cloudflare. My stuff goes down too. And it's "decentralized" in the way you think the internet "should" be, which entails its own risks. So what do you all want, exactly? A public lashing of every developer at Cloudflare who pushes a bug to prod? A congressional investigation? I just don't understand the outrage here.

Stuff breaks occasionally. Get used to it, and design accordingly.
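
For scale, here is a rough back-of-the-envelope sketch of what "12 hours of downtime once every couple years" works out to as an availability figure. It assumes "a couple years" means exactly two and ignores leap years; the function names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: ignore leap years


def availability(downtime_hours, period_years):
    """Fraction of time the service is up over the period."""
    return 1 - downtime_hours / (period_years * HOURS_PER_YEAR)


def allowed_downtime_minutes_per_year(nines):
    """Yearly downtime budget for an availability of N nines."""
    return HOURS_PER_YEAR * 60 * 10 ** -nines


print(f"{availability(12, 2):.4%}")                    # 99.9315%
print(f"{allowed_downtime_minutes_per_year(3):.1f}")   # 525.6
print(f"{allowed_downtime_minutes_per_year(5):.2f}")   # 5.26
```

So 12 hours every two years is a little better than three nines (about 8.8 hours a year allowed), and far from five nines (about 5 minutes a year).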

rigrassm|3 months ago

> So people can't log into their bank website for a few hours, banks used to only be open for like 4 hours a day and somehow we all survived.

1. I believe it's payment processing systems not functioning properly that causes real problems for people and not simply bank websites being down. Especially given...

2. Banks being closed so much back when cash/checks were actually widely used wasn't an issue because you could just pop over to an ATM or whip out a checkbook. In today's system, every single purchase you make requires communication between the merchant, your bank, and any number of middlemen via the internet.

Yeah, cash is still used today but I've been noticing even things like school sports events have stopped taking cash altogether and simply post a QR code to buy from your phone.

That is unless the school has crap cell reception (with no public Wi-Fi either!), Cloudflare shits the bed, Visa thinks you're buying porn, you locked your debit card and now can't unlock it cuz the website is down, or any one of the million things that break all the time. Replace school sports event with literally every single thing that requires a financial transaction and it's easy to see how even a short outage can lead to actual harm being realized.

joshuamcginnis|3 months ago

From a consumer's perspective, that makes sense. From a business's perspective, downtime can mean significant loss of revenue or new business opportunity.

jcattle|3 months ago

"The Cloudflare outage was a good thing [...] they're a warning. They can force redundancy and resilience into systems."

- he says. On Github.

jbreckmckye|3 months ago

I published it as a Gist because my own blog deployment pipeline was in a non-functioning state.

chasing0entropy|3 months ago

Spot on article, but without a call to action. What can we do to combat the migration of society to a centralized corpro-government intertwined entity with no regard for unprofitable privacy or individualism?

adrianN|3 months ago

Individuals are unlikely to be able to do something about the centralization problem except vote for politicians that want to implement countermeasures. I don’t know of any politicians (with a chance to win anything) that have that on their agenda.

DANmode|3 months ago

Learn how to host anything, today.

card_zero|3 months ago

We could quibble about the premise.

rzerowan|3 months ago

So we're going backwards to a world where there are basically 5 computers running everything and everyone is basically accessing the world through a dumb terminal. Even though the digital slab in our pockets has more compute than a roomful of the early gen devices. Hopefully critical infra shifts back to managed metal or private clouds. I don't see it though: the last decades of cloud evangelism to move all legacy systems to the cloud don't look like reversing anytime soon.

fragmede|3 months ago

Yeah it's crazy to realize it takes a room of electronics for me to get my (g)mail. The more things change, the more they stay the same, eh?

zwnow|3 months ago

I agree, considering all the Cloudflare, AWS and Azure apologists I see all around... Learning AWS already is the #1 tip on social media to "become employed as a dev in 2025 guaranteed" and I always just sigh when seeing this. I wouldn't touch it with a stick.

timenotwasted|3 months ago

"Embrace outages, and build redundancy." — It feels like back in the day this was championed pretty hard especially by places like Netflix (Chaos Monkey) but as downtime has become more expected it seems we are sliding backwards. I have a tendency to rely too much on feelings so I'm sure someone could point me to some data that proves otherwise but for now that's my read on things. Personally, I've been going a lot more in on self-hosting lots of things I used to just mindlessly leave on the cloud.

bcrl|3 months ago

I have cell phone calls regularly drop during tower handoffs, and codec errors that result in a blast of static upon answering a call. I can't remember a single time I had a phone call fail on the old PSTN built out of DMS10 and DMS100s locally (well, until we lost all trunks due to a fibre issue a couple of weeks ago on November 10th -- the incumbent didn't notice the outage which started at ~3:20am until ~9:30am, and it wasn't fixed until 17:38). One time when I was a teenager in the '90s, a friend and I had a 14 hour call using landlines.

The modern tech stack is disappointing in its lack of reliability. Complexity is the root of all evil.

rafaelcosta|3 months ago

I don't get why this applies to the Cloudflare outage but not to the AWS ones... I'd argue that the big cloud providers are WAY more impactful when they go down than Cloudflare. The only difference is that the average techie uses Cloudflare more and sees the impact more, but this point was already there before...

torginus|3 months ago

What happens if you don't use Cloudflare and just host everything on a server?

Can't you run a website like that if you don't host heavy content?

How common are DDoS attacks anyway, and aren't there local (to the server) tools that analyze user behavior to a decent accuracy (at least they can tell whether a client is using a real browser and behaving more or less like a human would, making attacks expensive)?

Can't you buy a list of ISP ranges from a GeoIP provider (you can), at least then you'd know which addresses belong to real humans.

I don't think botnets are that big of a problem (maybe in some obscure places of the world, but you can temp rangeban a certain IP range, if there's a lot of suspicious traffic coming from there).

If lots of legit networks (as in belonging to people who are paying an ISP for their network connections) have botnets, that means most PCs are compromised, which is a much more severe issue.
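
The "temp rangeban" idea mentioned above could look roughly like the sketch below, using Python's standard `ipaddress` module. The `TempRangeBan` class is hypothetical, the CIDR blocks are reserved documentation ranges rather than real offenders, and this only covers filtering at the server itself; it does nothing about traffic that saturates the link before it arrives.

```python
import ipaddress
import time


class TempRangeBan:
    """In-memory list of banned networks, each with an expiry time."""

    def __init__(self):
        self._bans = {}  # ip_network -> expiry timestamp (seconds)

    def ban(self, cidr, seconds, now=None):
        now = time.time() if now is None else now
        self._bans[ipaddress.ip_network(cidr)] = now + seconds

    def is_banned(self, ip, now=None):
        now = time.time() if now is None else now
        addr = ipaddress.ip_address(ip)
        # Drop bans that have expired before checking membership.
        self._bans = {net: exp for net, exp in self._bans.items() if exp > now}
        return any(addr in net for net in self._bans)


bans = TempRangeBan()
bans.ban("203.0.113.0/24", seconds=600, now=0)  # ban a range for 10 minutes

print(bans.is_banned("203.0.113.7", now=10))    # True
print(bans.is_banned("198.51.100.1", now=10))   # False
print(bans.is_banned("203.0.113.7", now=601))   # False, the ban has expired
```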

justsomehnguy|3 months ago

> What happens if you don't use Cloudflare and just host everything on a server?

It works.

> Can't you run a website like that if you don't host heavy content?

Even with heavy content, the question is how many visitors you have. If there is one visitor an hour, a 100Mbit/Unlim connection would suffice.

> How common are DDOS attacks anyway

Extremely rare. 99% of sites never experience it, 1% do have some trouble because somebody nearby is being DDoS'ed.

> and aren't there local (to the server), that analyze user behavior to a decent accuracy (at least it can tell they're using a real browser and behaving more or less like a human would, making attacks expensive).

No point, you can't do anything anyway - it's a denial of service so there are gigabytes of trash flowing your way.

> Can't you buy a list of ISP ranges from a GeoIP provider (you can), at least then you'd know which addresses belong to real humans.

No point. If you are not being DDoS'ed then you just spent money and time (ie money) on useless preventive measure you never use. And when (if) it would come you can't do anything anyway, because it's a distributed denial of service attack.

> I don't think botnets are that big of a problem (maybe in some obscure places of the world, but you can temp rangeban a certain IP range, if there's a lot of suspicious traffic coming from there).

It's not a DDoS if you can filter at the endpoint.

dijit|3 months ago

Yeah, you can.

Lots of people use raspberry pi’s for this, which is a smidge anaemic for some decent load (HN Hug Of Death)- even an Intel N100 is more grunt, for context.

This makes people think that their self hosting setup can never handle HN load; because when they see people talking about self hosting the site goes down.

dewey|3 months ago

Botnets use real residential connections not just data centers. So your static list of “real people” doesn’t really make a difference.

bcrl|3 months ago

voip.ms was pretty much offline for a couple of weeks while under a lengthy DDoS attack. They were only able to restore service by putting all their servers behind Cloudflare proxies to mitigate the ongoing DDoS.

zie1ony|3 months ago

My friend wasn't able to do RTG during the outage. They had to use ultrasound machine on his broken arm to see inside.

Aurornis|3 months ago

> My friend wasn't able to do RTG during the outage.

What is RTG?

YmiYugy|3 months ago

It's worth considering the counterfactual. Let's say there were a few dozen semi-popular DDoS protection services. Would that be better? Some assumptions: the services would be slightly less effective and also have worse downtimes. You could argue that Cloudflare is coasting on a monopoly and that competition would drive them to improve, but I'm pretty confident that DDoS protection is one of those things where having a large network to absorb attacks and a large team to monitor them is very valuable. I submit as evidence that Cloudflare has been doing well despite the 3 big cloud providers offering DDoS protection.

So what would be the result of highly decentralized but slightly worse and less reliable DDoS protection? I'd argue that for a lot of things this wouldn't be an improvement. Cloudflare being so dominant means lots of things go down simultaneously. But that only matters for fungible services, e.g. if a school's education portal goes down, it doesn't matter if all the other education portals are also down. There are cases where it matters, like the tyre pumps. I'd argue that these devices have no reason to be reliant on an online connection to begin with. I think cloud services as a whole have massively improved the reliability of internet services. In almost all cases reducing the overall amount of outages is a higher priority than preventing outage correlations.

stroebs|3 months ago

The problem is far more nuanced than the internet simply becoming too centralised.

I want to host my gas station network’s air machine infrastructure, and I only want people in the US to be able to access it. That simple task is literally impossible with what we have allowed the internet to become.

FWIW I love Cloudflare’s products and make use of a large amount of them, but I can’t advocate for using them in my professional job since we actually require distributed infrastructure that won’t fail globally in random ways we can’t control.

Aurornis|3 months ago

> and I only want people in the US to be able to access it. That simple task is literally impossible with what we have allowed the internet to become.

Is anyone else as confused as I am about how common anti-openness and anti-freedom comments are becoming on HN? I don’t even understand what this comment wants: Banning VPNs? Walling off the rest of the world from US internet? Strict government identity and citizenship verification of people allowed to use the internet?

It’s weird to see these comments get traction after growing up in an internet where tech comments were relentlessly pro freedom and openness on the web. Now it seems like every day I open HN and there are calls to lock things down, shut down websites, institute age (and therefore identity) verification requirements. It’s all so foreign and it feels like the vibe shift happened overnight.

zrm|3 months ago

> I want to host my gas station network’s air machine infrastructure, and I only want people in the US to be able to access it. That simple task is literally impossible with what we have allowed the internet to become.

That task was never simple and is unrelated to Cloudflare or AWS. The internet at a fundamental level only knows where the next hop is, not where the source or destination is. And even if it did, it would only know where the machine is, not where the person writing the code that runs on the machine is.

Xelbair|3 months ago

Genuine question - why are you spending time and effort on geofencing when you could spend it on improving your software/service?

It takes time and effort for no gain in any sensible business goal. People outside of US won't need it, bad actors will spoof their location, and it might inconvenience your real customers.

And if you want a secure communication just setup zero-trust network.

asimovDev|3 months ago

not a sysadmin here. why wouldn't this be behind a VPN or some kind of whitelist where only confirmed IPs from the offices / gas stations have access to the infrastructure?

Fnoord|3 months ago

Literally impossible? On the contrary; Geofencing is easy. I block all kind of nefarious countries on my firewall, and I don't miss them (no loss not being able to connect to/from a mafia state like Russia). Now, if I were to block FAMAG... or Cloudflare...

Joel_Mckay|3 months ago

Client side SSL certificates with embedded user account identification are trivial, and work well for publicly exposed systems where IPsec or Dynamic frame sizes are problematic (corporate networks often mangle traffic.)

Accordingly, connections from unauthorized users are effectively restricted, but access is also not necessarily pigeonholed to a single point of failure.

https://www.rabbitmq.com/docs/ssl
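
As a rough illustration of the client-certificate approach described above (not RabbitMQ-specific), here is a minimal sketch using Python's standard `ssl` module. The function names and file-path parameters are hypothetical, and treating the certificate's commonName as the account identifier is an assumption about how the account is embedded.

```python
import ssl


def make_mtls_server_context(cert_file=None, key_file=None, ca_file=None):
    """Server-side TLS context that rejects clients without a valid cert."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # handshake fails without a client cert
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        # Only client certificates signed by this CA are accepted.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


def account_from_peer_cert(peer_cert):
    """Read commonName from the dict returned by SSLSocket.getpeercert()."""
    for rdn in peer_cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


ctx = make_mtls_server_context()  # paths omitted here; supply real files in use
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True

example_cert = {"subject": ((("commonName", "alice"),),)}
print(account_from_peer_cert(example_cert))  # alice
```

In real use the context would wrap a listening socket, and the server would call `getpeercert()` on each accepted connection to decide which account it belongs to.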

Best of luck =3

notepad0x90|3 months ago

Is Cloudflare having more outages than aws, gcp or azure? Honestly curious, I don't know the answer.

eddd-ddde|3 months ago

I absolutely hate companies thinking they are being smart by blocking foreign IPs from using their websites.

Every single time I want to order a burger from the local place, I have to use a VPN to fake being in the country (even though I actually am already physically here) so that it will let me give them my money.

My phone's plan is not from here, so my IP address is actually not geographically in the same place as me.

oidar|3 months ago

I wonder what life without Cloudflare would look like. What practices would fill the gaps if a company didn't, or wasn't allowed to, satisfy the concerns that Cloudflare addresses?

immibis|3 months ago

Pretty much exactly like it does now but with less captchas.

Fun fact: Headless browsers can easily pass cloudflare captchas automatically. They're not actually captchaing - they're just a placebo. You just need to be coming from a residential IP address and using a real browser.

vasco|3 months ago

I'll die on the hill that centralization is more efficient than decentralization and that rare outages of hugely centralized systems that are otherwise highly reliable are much better than full decentralization with much worse reliability.

In other words, when AWS or Cloudflare go down it's catastrophic in the sense that everyone sees the issues at the same time, but smaller providers usually have much more ongoing issues, that just happen to be "chronic" vs "acute" pains.

GeneralMaximus|3 months ago

Efficient in terms of what, exactly?

There are multiple dimensions to this problem. Putting everything behind Cloudflare might give you better uptime, reliability, performance, etc. but it also has the effect of centralizing power into the hands of a single entity. Instead of twisting the arms of ten different CXOs, your local politician now only needs to twist the arm of a single CXO to knock your entire business off the internet.

I live in India, where the government has always been hostile to the ideals of freedom of speech and expression. Complete internet blackouts are common in several states, and major ISPs block websites without due process or an appeals mechanism. Nobody is safe from this, not even Github[1]. In countries like India, decentralization is a preventative measure.

[1] https://en.wikipedia.org/wiki/Censorship_of_GitHub#India

And I'm not even going to talk about abuse of monopoly power and all that. What happens when Cloudflare has their Apple moment? When they jack up their prices 10x, or refuse to serve customers that might use their CDNs to serve "inappropriate" content? When the definition of "inappropriate" is left fuzzy, so that it applies to everything from CSAM to political commentary?

No thanks.

Xelbair|3 months ago

>I'll die on hill that hyperoptimized systems are more efficient than anti-fragile.

Of course they are, the issue is what level of failure we're going to accept.

torginus|3 months ago

And the irony is that people are pushing for decentralization like microservices and k8s - on centralized platforms like AWS.

throwaway81523|3 months ago

Now just wait til every country on earth really does replace most of its employees with ChatGPT... and then OpenAI's data center goes offline with a fiber cut or something. All work everywhere stops. Cloudflare outage is nothing compared to that.

delaminator|3 months ago

That was this outage. ChatGPT and Claude are both behind Cloudflare’s bot detection. You couldn’t log into either web frontend.

And the error message said you were blocking them. We had support tickets coming in demanding to know why ChatGPT was being blocked.

We also couldn’t log into our supplier’s B2B system to place our customer orders.

So all the advice of “just self host” is moot when you’re in a food web.

DeathArrow|3 months ago

That's why it's better to have redundancy. Hire Claude and Deepseek, too.

teiferer|3 months ago

> goes offline with a fiber cut

If a fiber cut brings your network down then you have fundamental network design issues and need to change hiring practices.

tomschwiha|3 months ago

For me personally I didn't notice the downtime in the first hour or so. When using some website assets were not loading, but that's it. Turnstile outage maybe impacted me most. Could be because I'm EU based and Cloudflare is not "so" widespread here as in other parts of the world.

0x073|3 months ago

The outage wasn’t a good thing, since nothing is changing as a result. (How many outages has Cloudflare had?)

joeblubaugh|3 months ago

meta: why are we rewriting such anodyne titles? “was” -> “might be” undermines the author's point

tonyhart7|3 months ago

I don't like this argument, since you can apply it to Google, Microsoft, AWS, Facebook, etc.

The tech world is dominated by US companies, and what are the alternatives to most of these services? There are a lot fewer than you might think, and even then you must compromise in certain areas.

chris_wray|3 months ago

I really wish we could build a truly decentralized server platform.

SirMaster|3 months ago

If these systems are as important as they say, it's surprising to me that they are not built with backups and redundancies in place like other mission critical things are engineered and built with.

rldjbpin|3 months ago

feels like the main message is missed by seeing most of the discourse here:

> Outages like today's are a good thing because they're a warning. They can force redundancy and resilience into systems.

the advice is not to shun big companies and providers, but rather have a backup solution built-in for situations like this. switching solely to an in-house alternative is not always a great idea, but it can be a great backup solution.
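a rough sketch of that "backup built in" idea, in Rust. everything here is hypothetical: `primary_cdn` and `in_house_backup` are stand-ins that just return canned results (no real network calls), and `first_available` is a name made up for illustration. the point is only the shape: try the big provider first, fall through to the in-house path when it fails, and keep the errors around for debugging.

```rust
/// A provider is modeled as a plain function that either returns a body
/// or an error string. Real code would do network I/O here (assumption).
type Fetch = fn() -> Result<String, String>;

/// Try each provider in order. Return the name and body of the first one
/// that succeeds, or every collected error if all of them fail.
fn first_available(providers: &[(&str, Fetch)]) -> Result<(String, String), Vec<String>> {
    let mut errors = Vec::new();
    for (name, fetch) in providers {
        match fetch() {
            Ok(body) => return Ok((name.to_string(), body)),
            Err(e) => errors.push(format!("{}: {}", name, e)),
        }
    }
    Err(errors)
}

/// Hypothetical primary provider that is currently down.
fn primary_cdn() -> Result<String, String> {
    Err("503 from edge".to_string())
}

/// Hypothetical in-house fallback that still answers.
fn in_house_backup() -> Result<String, String> {
    Ok("hello from origin".to_string())
}
```

calling `first_available(&[("primary", primary_cdn as Fetch), ("backup", in_house_backup as Fetch)])` returns the backup's response here, because the primary stand-in always fails. the order of the slice is the priority order, so the in-house option stays a backup rather than the default.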

mrasong|3 months ago

Yeah, when it went down, a bunch of the sites I use every day just stopped working.

That’s when I realized it’s basically one of the backbone pieces of the entire internet.

chrisjj|3 months ago

True title: The Cloudflare outage was a good thing

L-four|3 months ago

It's a tragedy of the commons. Even if you don't use Cloudflare, does it matter if no one can pay for your products?

theideaofcoffee|3 months ago

> They [outages] can force redundancy and resilience into systems.

They won’t until either the monetary pain of outages becomes greater than the inefficiency of holding on to more systems to support that redundancy, or, government steps in with clear regulation forcing their hand. And I’m not sure about the latter. So I’m not holding my breath about anything changing. It will continue to be a circus of doing everything on a shoestring because line must go up every quarter or a shareholder doesn’t keep their wings.

morshu9001|3 months ago

That's ok though, not every website needs 5 9s

charcircuit|3 months ago

> It's ironic because the internet was actually designed for decentralisation, a system that governments could use to coordinate their response in the event of nuclear war

This is not true. The internet was never designed to withstand nuclear war.

chasing0entropy|3 months ago

Arpanet absolutely was designed to be a physically resilient network which could survive the loss of multiple physical switch locations.

anonym29|3 months ago

ARPANET was literally invented during the cold war for the specific and explicit purpose of networked communications resilience for government and military in the event major networking hubs went offline due to one or more successful nuclear attacks against the United States

bblb|3 months ago

Perhaps. Perhaps not. But it will survive it. It will survive a complete nuclear winter. It's too useful to die, and will be one of the first things to be fixed after global annihilation.

But the Internet is not hosting companies or cloud providers. The Internet does not care if they don't build their systems resilient enough and let the SPOFs creep up. The Internet does its thing and the packets keep flowing. Maybe BGP and DNS could use some additional armoring, but there are ways around both of them in case of an actual emergency.

almosthere|3 months ago

how many people are still on us-east-1

mcny|3 months ago

My old employer used Azure. It irritated me to no end when they said we must rename all our resources to match the convention of naming everything in US East as "eu-" (for "Eastern United States", I guess).

A total clown show

ovo101|3 months ago

Outages like this highlight just how much of the internet’s resilience depends on a single provider. In a way, it’s a healthy reminder: if one company’s hiccup can take down half the web, maybe we’ve over‑centralized. A “good thing” only if it sparks more serious conversations about redundancy, multi‑provider strategies, and reducing monoculture risk. Otherwise, we’ll just keep repeating the same failure modes at larger scales.

0xbadcafebee|3 months ago

Centralization has nothing to do with the problems of society and technology. And if you think the internet is all controlled by just a couple companies, you don't actually understand how it works. The internet is wildly decentralized. Even Cloudflare is. It offers tons of services, all of which are completely optional and can be used individually. You can also stop using them at any time, and use any of their competitors (of which there are many).

If, on the off chance, people just get "addicted" to Cloudflare, and Cloudflare's now-obviously-terrible engineering causes society to become less reliable, then people will respond to that. Either competitors will pop up, or people will depend on them less, or governments will (finally!) impose some regulations around the operation of technical infrastructure.

We have actually too much freedom on the Internet. Companies are free to build internet systems any way they want - including in very unreliable ways - because we impose no regulations or standards requirements on them. Those people are then free to sell products to real people based on this shoddy design, with no penalty for the products falling apart. So far we haven't had any gigantic disasters (Great Chicago Fire, Triangle Shirtwaist Factory Fire, MGM Grand Hotel Fire), but we have had major disruptions.

We already dealt with this problem in the rest of society. Buildings have building codes, fire codes, electrical codes. They prescribe and require testing procedures, provide standard building methods to ensure strength in extreme weather, resist a spreading fire long enough to allow people to escape, etc. All measures to ensure the safety and reliability of the things we interact with and depend on. You can build anything you want - say, a preschool? - but you aren't allowed to build it in a shoddy manner. We have that for physical infrastructure; now we need it for virtual infrastructure. A software building code.

DeathArrow|3 months ago

Centralization means having a single point of failure for everything. If your government, mobile phone or car stops working, it doesn't mean all governments, all cars and all mobile phones stop working.

Centralization makes mass surveillance easier and makes selective denial of service easier. Centralization also means that once someone hacks into the system, they gain access to all the data, not just a part of it.

nicman23|3 months ago

i hate that i cannot just scrape things for my own usage and i have to use things like Camoufox instead of curl

Surac|3 months ago

The thing I learned from the incident is that Rust offers an unwrap function. It puzzles me why the hell they build such a function in the first place.

aw1621107|3 months ago

> It puzzles me why the hell they build such a function in the first place.

One reason is similar to why most programming languages don't return an Option<T> when indexing into an array/vector/list/etc. There are always tradeoffs to make, especially when your strangeness budget is going to other things.
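To make the comparison concrete, here is a small sketch using only the standard library. The function names are made up for illustration. Slice indexing with `values[2]` hands back the element directly and panics when the index is out of bounds, while `values.get(2)` returns an `Option` that the caller has to deal with. Calling `unwrap()` on that `Option` gets you back to the "panic if it's missing" behavior, which is roughly what plain indexing does implicitly.

```rust
/// Checked access: the caller gets an Option and must handle None.
fn third_checked(values: &[u32]) -> Option<u32> {
    values.get(2).copied()
}

/// Checked access with an explicit fallback instead of a panic.
fn third_or_default(values: &[u32], default: u32) -> u32 {
    values.get(2).copied().unwrap_or(default)
}

/// Plain indexing: convenient, but panics if the slice has fewer than 3 items.
fn third_indexed(values: &[u32]) -> u32 {
    values[2]
}

/// Same failure mode as indexing, spelled out: unwrap() panics on None.
fn third_unwrap(values: &[u32]) -> u32 {
    values.get(2).copied().unwrap()
}
```

With `&[1, 2, 3]` all four return the value 3 (wrapped in `Some` for the checked version). With `&[1]`, `third_checked` returns `None`, `third_or_default` returns the fallback, and the last two panic. If every index returned an `Option`, the first form would be the only one available, which is the ergonomic cost the comment is pointing at.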