top | item 46159287

luastoned | 2 months ago

From the incident page:

A change made to how Cloudflare's Web Application Firewall parses requests caused Cloudflare's network to be unavailable for several minutes this morning. This was not an attack; the change was deployed by our team to help mitigate the industry-wide vulnerability disclosed this week in React Server Components. We will share more information as we have it today.

https://www.cloudflarestatus.com/incidents/lfrm31y6sw9q

reassess_blind|2 months ago

I’m really curious what their rollout procedure is, because it seems like many of their past outages would have been caught if they had released these configuration changes to 1% of global traffic first.
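For illustration, a minimal sketch of how a "1% first" gate could work. This is not Cloudflare's actual mechanism: the function name `in_rollout`, the identifiers, and the hashing scheme are all assumptions. Each unit (a server, a data center, a customer zone) is hashed into a stable bucket, and a change only applies where the bucket falls under the current percentage.

```python
import hashlib


def in_rollout(unit_id: str, change_id: str, percent: float) -> bool:
    """Return True if unit_id is inside the first `percent` of the rollout.

    Hypothetical helper: the hash makes the decision deterministic, so the
    same unit stays in (or out of) the cohort as the percentage is raised.
    """
    digest = hashlib.sha256(f"{change_id}:{unit_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < percent * 100  # percent=1.0 covers buckets 0..99


# Hypothetical usage: count how many of 20,000 made-up units get the change at 1%.
units = [f"edge-{i}" for i in range(20_000)]
selected = sum(in_rollout(u, "waf-parser-change", 1.0) for u in units)
```

Raising `percent` in steps (1, 10, 50, 100) while watching error rates between steps is the usual pattern; a unit selected at 1% stays selected at every later step.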

lima|2 months ago

They don't appear to have a rollout procedure for some of their globally replicated application state. They had a number of major outages over the past years which all had the same root cause of "a global config change exposed a bug in our code and everything blew up".

I guess it's an organizational consequence of mitigating attacks in real time, where rollout delays can be risky as well. But if you're going to do that, it would appear that the code has to be written much more defensively than it is right now.
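A rough sketch of what "more defensively" could mean here, under stated assumptions: `ConfigHolder` and `compile_rules` are hypothetical names, not anything from Cloudflare's code. The idea is that a config which fails to compile is rejected and the last known-good one keeps serving, instead of the failure propagating.

```python
from typing import Any, Callable, Optional


class ConfigHolder:
    """Keeps the last configuration that compiled successfully."""

    def __init__(self, compile_fn: Callable[[Any], Any]) -> None:
        self._compile = compile_fn
        self.active: Optional[Any] = None
        self.rejected = 0  # count of configs refused, for alerting

    def apply(self, raw: Any) -> bool:
        """Try to activate a new config; keep the old one on any failure."""
        try:
            candidate = self._compile(raw)
        except Exception:
            self.rejected += 1
            return False
        self.active = candidate
        return True


def compile_rules(raw: Any) -> tuple:
    """Hypothetical compiler: accepts only a dict with a 'rules' key."""
    if not isinstance(raw, dict) or "rules" not in raw:
        raise ValueError("malformed config")
    return tuple(raw["rules"])
```

This only covers failures that surface at compile time. A config that compiles but misbehaves on live requests would still need a staged rollout or a health check to catch it.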

stogot|2 months ago

The update they describe should never bring down all services. I agree with other posters that they must lack a rollout strategy, yet they sent spam emails mocking the reliability of other clouds.

Traubenfuchs|2 months ago

"Please don’t block the rollout pipeline with a simple React security patch update."

philipwhiuk|2 months ago

So their parser broke again I guess.

And no staged rollout I assume?

tialaramex|2 months ago

Apparently, somehow, this had never been how Cloudflare did this. I expressed incredulity about this to one of their employees, but yeah, it seems their attitude was "We never make mistakes, so it's fastest to just deploy every change across the entire system immediately", and as we've seen repeatedly in the recent past, that means it sometimes blows up.

They have blameless post mortems, but maybe "We actually do make mistakes so this practice is not good" wasn't a lesson anybody wanted to hear.

o_m|2 months ago

I wonder if this is the new normal? Weekly Cloudflare outages that break huge parts of the internet.

meindnoch|2 months ago

React (a frontend JS framework) can now bring down critical Internet infrastructure.

I will repeat it because it's so surreal: React (a frontend JS framework) can now bring down critical Internet infrastructure.

cryptonym|2 months ago

That's Next.js, not React.

Mentioning React Server Components in the status page can be seen as a bad way to shift the blame. Would have been better to not specify which CVE they were trying to patch. The issue is their rollout management, not the Vendor and CVE.

spiffytech|2 months ago

To be accurate: React developed server-side capabilities, and that's where the vulnerability exists.

It feels noteworthy because React started out frontend-only, but pedantically it's just another backend with a vulnerability.

uyzstvqs|2 months ago

Ah yes, Cloudflare's worst enemy: The configuration change.

hinkley|2 months ago

On Fridays, yes.

aatd86|2 months ago

so it's react again in the end .. zzzzzzz

pepoluan|2 months ago

So. Another regex problem?