jpollock | 1 month ago

Measurement and alerting are usually done on business metrics, not on the underlying causes. That way you catch whole classes of problems rather than one failure mode at a time.

Not sure about expected loss. Is that a decay rate?

But stuck jobs are caught by watching the number of tasks being processed and the average latency.
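A rough sketch of what I mean, in Python. All the names here (`WindowStats`, `check_window`) and the thresholds are made up for illustration, and the queue depth field is my own assumption so that an idle system doesn't page anyone. The point is that the check looks only at throughput and average latency, not at any particular cause.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WindowStats:
    """Aggregates for one monitoring window (hypothetical names)."""
    tasks_completed: int    # tasks finished during the window
    total_latency_s: float  # sum of per-task latencies, in seconds
    queue_depth: int        # tasks still waiting (assumed to be available)


def average_latency(stats: WindowStats) -> float:
    """Mean latency per completed task; 0.0 when nothing completed."""
    if stats.tasks_completed == 0:
        return 0.0
    return stats.total_latency_s / stats.tasks_completed


def check_window(
    stats: WindowStats,
    min_throughput: int = 1,          # assumed floor, tune per system
    max_avg_latency_s: float = 30.0,  # assumed ceiling, tune per system
) -> List[str]:
    """Return alert messages based on outcome metrics only."""
    alerts = []
    # Work is waiting but too little is finishing: jobs are likely stuck.
    if stats.queue_depth > 0 and stats.tasks_completed < min_throughput:
        alerts.append("throughput below floor while work is queued")
    # Tasks finish, but too slowly on average.
    if average_latency(stats) > max_avg_latency_s:
        alerts.append("average latency above threshold")
    return alerts
```

Something like `check_window(WindowStats(0, 0.0, 42))` flags the stuck case whether the cause is a deadlock, a dead worker, or a bad deploy, which is the "classes of problems" bit.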
