top | item 46163343

(no title)

dkyc | 2 months ago

One thing to keep in mind when judging what's 'appropriate' is that Cloudflare was effectively responding to an ongoing security incident outside of their control (the React Server RCE vulnerability). Part of Cloudlfare's value proposition is being quick to react to such threats. That changes the equation a bit: any hour you wait longer to deploy, your customers are actively getting hacked through a known high-severity vulnerability.

In this case it's not just a matter of 'hold back for another day to make sure it's done right', like when adding a new feature to a normal SaaS application. In Cloudflare's case moving slower also comes with a real cost.

That isn't to say it didn't work out badly this time, just that the calculation is a bit different.

discuss

flaminHotSpeedo|2 months ago

To clarify, I'm not trying to imply that I definitely wouldn't have made the same decision, or that cowboy decisions aren't ever the right call.

However, this preliminary report doesn't really justify the decision to use the same deployment system responsible for the 11/18 outage. Deployment safety should have been the focus of this report, not the technical details. My question that I want answered isn't "are there bugs in Cloudflare's systems" it's "has Cloudflare learned from it's recent mistakes to respond appropriately to events"

vlovich123|2 months ago

> doesn't really justify the decision to use the same deployment system responsible for the 11/18 outage

There’s no other deployment system available. There’s a single system for config deployment and it’s all that was available as they haven’t yet done the progressive roll out implementation yet.

dkyc|2 months ago

The 11/18 outage was 2.5 weeks ago. Any learning & changes they made as a result for that probably didn't make its way yet to production.

Particularly if we're asking them to be careful & deliberate about deployments, hard to ask them fast-track this.

Already__Taken|2 months ago

the cve isn't a zero day though how come cloudflare werent at the table for early disclosure?

flaminHotSpeedo|2 months ago

Do you have a public source about an embargo period for this one? I wasn't able to find one

ascorbic|2 months ago

Cloudflare did have early access, and had mitigation in place from the start. The changes that were being rolled out were in response to ongoing attempts to bypass those.

Disclosure: I work at Cloudflare, but not on the WAF

cowsandmilk|2 months ago

Cloudflare had already decided this was a rule that could be rolled out using their gradual deployment system. They did not view it as being so urgent that it required immediate global roll out.

udev4096|2 months ago

[deleted]

toomuchtodo|2 months ago

Indeed, but it is what it is. Cloudflare comes out of my budget, and even with downtime, its better than not paying them. Do I want to deal with what Cloudflare offers? I do not, I have higher value work to focus on. I want to pay someone else to deal with this, and just like when cloud providers are down, it'll be back up eventually. Grab a coffee or beer and hang; we aren't savings lives, we're just building websites. This is not laziness or nihilism, but simply being rational and pragmatic.