Are We Ready to Kill Thresholds?

26 points| obfuscurity_ | 12 years ago |obfuscurity.com

3 comments

Pewpewarrows|12 years ago

Forgive me if this is a dumb comment, as I'm just starting to get into monitoring and the statistics that go along with it, but adaptive fault detection does scare me a bit. If a problem isn't a spike and instead builds up gradually over hours, days, or weeks, I wouldn't be confident in something picking a dynamic threshold for me. I'd be afraid of it treating the ever-rising resource usage as normal behavior if it happens slowly enough, and of not being alerted before it's too late (servers becoming unresponsive).
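
To make the worry above concrete, here is a minimal sketch in Python. It is not how any particular monitoring product works; it assumes the simplest adaptive rule one could pick (alert when a sample exceeds the rolling mean plus k rolling standard deviations). The function names, the window size, the limit of 90, and the synthetic "leak" series are all hypothetical, chosen only for illustration.

```python
import statistics
from collections import deque


def adaptive_alerts(values, window=60, k=3.0):
    """Indices where a sample exceeds rolling mean + k * rolling stdev.

    Assumed rule for illustration only; real products may differ.
    """
    history = deque(maxlen=window)
    alerts = []
    for i, v in enumerate(values):
        # Only judge a sample once a full window of history exists.
        if len(history) == window:
            mean = statistics.fmean(history)
            stdev = statistics.pstdev(history)
            if v > mean + k * stdev:
                alerts.append(i)
        history.append(v)
    return alerts


def static_alerts(values, limit):
    """Indices where a sample exceeds a fixed threshold."""
    return [i for i, v in enumerate(values) if v > limit]


# Hypothetical slow leak: usage climbs from about 40 to about 100 over
# 6000 samples, with a deterministic +/-1 jitter standing in for noise.
leak = [40 + 0.01 * i + (1 if i % 2 == 0 else -1) for i in range(6000)]

adaptive_hits = adaptive_alerts(leak)
static_hits = static_alerts(leak, limit=90)

print("adaptive alerts on slow leak:", len(adaptive_hits))
print("static alerts on slow leak:", len(static_hits))
```

In this sketch the rolling baseline climbs along with the leak, so each new sample sits only slightly above the recent mean and the adaptive rule stays quiet, while the fixed limit eventually fires. A sudden jump of 20 on the same series would trip the adaptive rule immediately. That is the trade-off described above, at least for this naive rule.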

obfuscurity_|12 years ago

That's not at all a dumb comment. As I alluded to in the post, I think it's important that we understand how these systems determine what is - or isn't - an abnormality or fault. Unfortunately, that often means vendors revealing their "secret sauce" and risking their product differentiation. It's going to be interesting to see how these products earn our trust.