top | item 45978525


polack | 3 months ago

What is not realistic? Doing simple input validation on data that has the potential to break 20% of the internet? Having a system in place to roll back to the last known good state when things crash?

Cloudflare builds a global-scale system, not an iPhone app. Please act like it.


raxxorraxor|3 months ago

Cloudflare's success came from simplicity: a distributed system across data centers around the world that third-party IT workers could set up while Cloudflare itself was only a few people. There are probably a lot of shitty iPhone apps that do less important work and are vastly more complex than the early Cloudflare server node configuration.

Every system carries some irreducible risk, and no data rollback is trivial, especially for a CDN.

aquariusDue|3 months ago

Yeah, I don't quite understand the people cutting Cloudflare massive slack. It's not about nailing blame on a single person or a team, it's about keeping a company that is THE closest thing to a public utility for the web accountable. They more or less did a Press Release with a call to action to buy or use their services at the end and everybody is going "Yep, that's totally fine. Who hasn't sent a bug to prod, amirite?".

It goes over my head why Cloudflare is HN's darling while others like Google, Microsoft and AWS don't usually enjoy the same treatment.

miyuru|3 months ago

>It goes over my head why Cloudflare is HN's darling while others like Google, Microsoft and AWS don't usually enjoy the same treatment.

Do the others you mentioned provide such detailed outage reports within 24 hours of an incident? I’ve never seen others share the actual code related to the incident.

Or the CEO or CTO replying to comments here?

>Press Release

This is not a press release; they have published these outage posts since the start of the company.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

dspillett|3 months ago

> To do simple input validation on data that has the potential to break 20% of the internet?

There will always be bugs in code, even simple code, and sometimes those things don't get caught before they cause significant trouble.

The failure here was not having a quick rollback option, or having one and not hitting the button soon enough. Even if they thought the problem was probably something else, my paranoia about my own code quality is such that I would have rolled back much sooner, just in case I was wrong about the “something else”.
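
For what it's worth, the two ideas argued over in this thread (validate before applying, keep a last known good copy to roll back to) can be sketched in a few lines. This is only an illustration, not how Cloudflare's system works: `ConfigStore`, `validate`, and the `MAX_FEATURES` limit are all hypothetical names and values made up for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical limit, chosen only for illustration.
MAX_FEATURES = 200


class ValidationError(Exception):
    """Raised when a candidate config fails basic sanity checks."""


def validate(features: List[str]) -> None:
    """Cheap input validation: non-empty, bounded size, no duplicates."""
    if not features:
        raise ValidationError("empty feature list")
    if len(features) > MAX_FEATURES:
        raise ValidationError(f"too many features: {len(features)}")
    if len(set(features)) != len(features):
        raise ValidationError("duplicate feature names")


@dataclass
class ConfigStore:
    """Holds the active config plus the previous one for quick rollback."""

    active: List[str]
    last_known_good: List[str] = field(init=False)

    def __post_init__(self) -> None:
        validate(self.active)
        self.last_known_good = list(self.active)

    def apply(self, candidate: List[str]) -> bool:
        """Swap in the candidate only if it validates; otherwise keep serving."""
        try:
            validate(candidate)
        except ValidationError:
            return False
        self.last_known_good = list(self.active)
        self.active = list(candidate)
        return True

    def rollback(self) -> None:
        """Restore the config that was active before the last successful apply."""
        self.active = list(self.last_known_good)


store = ConfigStore(["bot_score", "ip_reputation"])
store.apply(["bot_score", "bot_score"])  # rejected: duplicates, old config stays
store.apply(["bot_score", "ip_reputation", "ja3"])  # accepted
store.rollback()  # back to the two-feature config
```

Validation only catches the failure modes someone thought to check for, which is why the rollback path matters: it covers the bugs that pass validation anyway.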