top | item 45974353


RagingCactus | 3 months ago

Lots of people here are (perhaps rightly) pointing to the unwrap() call as the issue. That might be true, but to me the fact that a reasonably "clean" panic at a defined line of code was not quickly picked up by any error monitoring system sounds just as important to investigate.

Assuming something similar to Sentry is in use, it should clearly pick up the many process crashes that start occurring right as the downtime starts. And the well-defined, clean crashes should in theory also stand out against all the random errors that start occurring all over the system as it begins to go down, precisely because they always fail at the exact same point.
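To make the "always failing at the exact same point" idea concrete, here is a minimal sketch of grouping panics by source location. It is not how Sentry or any particular monitoring product works; `install_panic_reporter`, `top_panic_site`, and the in-memory `PANIC_COUNTS` map are hypothetical names standing in for a real reporting backend. Only `std::panic::set_hook` and the panic info's `location()` are actual standard library APIs.

```rust
use std::collections::HashMap;
use std::panic;
use std::sync::{Mutex, OnceLock};

// Hypothetical in-process stand-in for an error-monitoring backend:
// counts panics keyed by "file:line".
static PANIC_COUNTS: OnceLock<Mutex<HashMap<String, u64>>> = OnceLock::new();

fn counts() -> &'static Mutex<HashMap<String, u64>> {
    PANIC_COUNTS.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Installs a panic hook that records where each panic happened.
/// A real setup would ship this to an external service instead.
pub fn install_panic_reporter() {
    panic::set_hook(Box::new(|info| {
        let key = match info.location() {
            Some(loc) => format!("{}:{}", loc.file(), loc.line()),
            None => "unknown".to_string(),
        };
        if let Ok(mut map) = counts().lock() {
            *map.entry(key).or_insert(0) += 1;
        }
    }));
}

/// Returns the panic site seen most often, with its count.
pub fn top_panic_site() -> Option<(String, u64)> {
    let map = counts().lock().ok()?;
    let best = map
        .iter()
        .max_by_key(|(_, n)| **n)
        .map(|(site, n)| (site.clone(), *n));
    best
}
```

With something like this in place, a single `file:line` key whose count suddenly spikes is exactly the kind of signal that should stand out from unrelated downstream errors.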


rixed|3 months ago

Exactly! You could have `rand() > 0.5 && panic!()` in the code of your bot module, and that should not set the internet on fire.

The issue here is about the system as a whole not any line of code.
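As a rough illustration of "the system as a whole", a caller can treat a panicking module as a failed optional step instead of a fatal one. This is only a sketch: `Verdict` and `score_with_fallback` are made-up names, and `std::panic::catch_unwind` only helps when the binary is built with unwinding panics (it does nothing under `panic = "abort"`).

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical outcome of an optional scoring module.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Score(u8),
    /// The module panicked; the request continues without a score.
    Unscored,
}

/// Runs `module` and converts a panic into a fallback verdict
/// instead of letting it take down the calling thread.
pub fn score_with_fallback<F>(module: F) -> Verdict
where
    F: FnOnce() -> u8,
{
    match panic::catch_unwind(AssertUnwindSafe(module)) {
        Ok(score) => Verdict::Score(score),
        Err(_) => Verdict::Unscored,
    }
}
```

Whether failing open like this is acceptable depends on the module; the point is just that the decision is made by the surrounding system, not by whichever line happened to panic.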

frumplestlatz|3 months ago

> The issue here is about the system as a whole not any line of code.

Unsoundness in the type system that leads to a systemic failure is about the system as a whole.

Not everything can be recovered by restarting a process, and process correctness and recovery are things that also derive from your type system.
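One way to read the type-system point: if loading returns a `Result`, the caller is forced to decide what recovery means, for example keeping the last known good state. The sketch below is purely illustrative; `LoadError`, `MAX_ENTRIES`, `load_entries`, and `reload` are hypothetical names and are not taken from any real codebase.

```rust
/// Hypothetical error for a loader with a fixed capacity.
#[derive(Debug, PartialEq)]
pub enum LoadError {
    TooManyEntries { got: usize, max: usize },
}

/// Hypothetical capacity limit, chosen only for this example.
pub const MAX_ENTRIES: usize = 4;

/// Returns an error instead of panicking when the input is too large.
pub fn load_entries(raw: &[&str]) -> Result<Vec<String>, LoadError> {
    if raw.len() > MAX_ENTRIES {
        return Err(LoadError::TooManyEntries {
            got: raw.len(),
            max: MAX_ENTRIES,
        });
    }
    Ok(raw.iter().map(|s| s.to_string()).collect())
}

/// Keeps the current entries when the new input is rejected,
/// so a bad update degrades to "stale" rather than "down".
pub fn reload(current: Vec<String>, raw: &[&str]) -> Vec<String> {
    match load_entries(raw) {
        Ok(fresh) => fresh,
        Err(_) => current,
    }
}
```

Calling `.unwrap()` on `load_entries` would compile just as well, which is the crux of the disagreement: the type makes the failure visible, but it is still up to the caller to handle it.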