anomaloustho | 3 months ago
As a company, you don’t want to declare an outage readily and you definitely don’t want it to be declared frequently. Declaring an outage frequently means:
• Telling your exec team that your department is not running well
• Negative signal to your investors
• Bad reputation with your customers
• Admitting culpability to your customers and partners (inviting lawsuits and refunds)
• Telling your engineering leadership team that your specific team isn’t running well
• Messing up your quarterly goals, bonuses, etcetera for outages that aren’t real
So every social and incentive structure along the way basically signals that you don’t want to declare an outage when it isn’t real. You want to make sure you get it right. Therefore, you don’t just want to flip a status page because a few API calls had a timeout.
FinnKuhn | 3 months ago
I would argue that every social and incentive structure along the way basically signals that you don't want to declare an outage, even when it is real. You should still do it, though, or the status page becomes meaningless.
Great example of Goodhart's law.
gwbas1c | 3 months ago
I've personally challenged some details in these policies, which I won't discuss publicly. What I generally agree with is that it's important to have a human in the loop, and to be very thoughtful about when to update a status page and what is put there.