zzyzxd | 8 months ago
- Their alerts were not durable. The outage took out the alerting system, so humans were just eyeballing dashboards for the duration. What if your critical system went down along with that alerting system, in the middle of the night?
- The cloud marketplace service was affected by the Cloudflare outage and there was nothing they could do.
- Tiered storage was down and disk usage went above normal levels, but there was no anomaly detection and no alerts. It survived only because t0 storage was massively over-provisioned.
- They took pride in using well-known industry designs like cell-based architecture, redundancy, multi-AZ... ChatGPT would be able to give me a better list.
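To be concrete about the disk usage point: even a crude check would have caught it. Here is a minimal sketch, assuming you have a list of recent usage samples in percent. The `is_anomalous` helper, the z-score threshold, and the 85% hard limit are all made up by me for illustration, not anything from their postmortem.

```python
from statistics import mean, stdev


def is_anomalous(history, current, z_threshold=3.0, hard_limit=85.0):
    """Return True if a disk-usage sample (percent) looks abnormally high.

    history: recent usage samples in percent (hypothetical input).
    z_threshold, hard_limit: assumed values, tune for your own system.
    """
    # An absolute ceiling always alerts, regardless of history.
    if current >= hard_limit:
        return True
    # Need at least two samples to estimate a standard deviation.
    if len(history) < 2:
        return False
    mu = mean(history)
    sigma = stdev(history)
    # Perfectly flat history: any increase counts as a deviation.
    if sigma == 0:
        return current > mu
    # Only upward deviations matter for "disk is filling up".
    return (current - mu) / sigma > z_threshold
```

Something like `is_anomalous([40, 41, 39, 40, 42], 60)` would fire long before the disk is actually full. Of course, the check has to run somewhere that doesn't share fate with the system it watches, which is the first bullet again.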
And I don't get why they had to roast CrowdStrike at the end. I mean, the CrowdStrike incident was really amateur stuff, like, the absolute lowest bar I can think of.