Thanks for this! Any chance you'll be running this class in a way I can attend any time in the near future in London? (I could be convinced to fly somewhere else in Europe for a couple days though)
Looks like a good overview, but I'm surprised it didn't cover process control charts, since they are easy to make and are designed specifically for detecting situations where the world has changed in a meaningful way (e.g., a machine is malfunctioning). The book "Understanding Variation: The Key to Managing Chaos" by Donald Wheeler has a cult following and exalts control charts in the business and manufacturing world. Parts of it are a little goofy, but I do think it does a good job of making a simple tool useful for practitioners.
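For readers who have not met a control chart, here is a minimal sketch of an individuals (XmR) chart in Python. The latency numbers are made up for illustration, and the function names are hypothetical. The 2.66 factor is the usual XmR scaling constant applied to the mean moving range; treat this as a sketch under those assumptions, not a full treatment of the book's rules.

```python
from statistics import mean

def xmr_limits(baseline):
    """Return (lower, center, upper) natural process limits from a baseline run."""
    center = mean(baseline)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    spread = 2.66 * mean(moving_ranges)  # standard XmR scaling constant
    return center - spread, center, center + spread

def out_of_limits(points, lower, upper):
    """Indices of points that fall outside the limits."""
    return [i for i, x in enumerate(points) if x < lower or x > upper]

# Hypothetical daily measurements (for example, a latency in ms).
baseline = [50, 52, 49, 51, 50, 48, 52, 50, 49, 51]
lower, center, upper = xmr_limits(baseline)

new_points = [51, 49, 58, 50]
flagged = out_of_limits(new_points, lower, upper)
print(flagged)  # [2]: only the 58 falls outside the limits
```

The limits come from a baseline period that is believed to be stable; new points are then judged against them, which is the "see if the points go outside the lines" step.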
Interesting pointer. I had looked at the control theory literature. We used the CUSUM "control chart" as an internal component of an anomaly detection method we built a while ago. This had limited success. Usually the data is just too noisy. Industrial sensors tend to be much better behaved than the stuff that you get from the Linux kernel.
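As a rough illustration of the CUSUM idea, here is a textbook two-sided tabular CUSUM in Python. This is not the anomaly detection method mentioned above; the function name, the data, and the `target`, `slack`, and `threshold` parameters are all assumptions chosen for the example.

```python
def cusum_first_alarm(values, target, slack, threshold):
    """Return the index of the first CUSUM alarm, or None if there is none.

    Deviations from `target` smaller than `slack` are ignored; larger ones
    accumulate until either running sum exceeds `threshold`.
    """
    high = 0.0  # accumulated evidence of an upward shift
    low = 0.0   # accumulated evidence of a downward shift
    for i, x in enumerate(values):
        high = max(0.0, high + (x - target - slack))
        low = max(0.0, low + (target - x - slack))
        if high > threshold or low > threshold:
            return i
    return None

# Hypothetical metric that drifts upward by a few units halfway through.
drifting = [50, 51, 49, 50, 52, 53, 52, 53, 54, 53]
steady = [50, 51, 49, 50, 50, 51, 49, 50]

print(cusum_first_alarm(drifting, target=50, slack=1, threshold=5))  # 7
print(cusum_first_alarm(steady, target=50, slack=1, threshold=5))    # None
```

The sums build up over several slightly elevated points, so a small sustained shift can trigger an alarm even though no single value looks extreme. On noisy data the same mechanism tends to fire often unless `slack` and `threshold` are widened, which is consistent with the limited success described above.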
Yes, in cases where statistical outliers can actually represent machinery failure, it's important that the 'roll ups' mentioned in the article don't hide real underlying problems.
+1 to this. This couldn't have come at a better time, as I transition into more SRE-style work and have been tasked with not only creating some dashboards but relevant ones.
heinrichhartman | 6 years ago
This article is quite dated. I ran the Statistics for Engineers class at various conferences over the years, and updated the material. I literally just did a session at SRECon EMEA today [1]!
The course material is here: https://github.com/HeinrichHartmann/Statistics-for-Engineers...
Today's version includes new material about:
- How averaging percentiles breaks down
- How sub-sampling affects percentile calculations
- Comparison of "mergeable aggregation methods" like HDR Histograms, t-digest, etc.
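To make the first and third bullets concrete, here is a small Python sketch with made-up latencies. It is not taken from the course material. The exact-value `Counter` is a stand-in for a real mergeable sketch such as an HDR Histogram or t-digest, which bucket or compress values rather than storing each distinct one; the function names are hypothetical.

```python
from collections import Counter

def percentile(values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    rank = (q * len(ordered) + 99) // 100  # ceil(q * n / 100) without floats
    return ordered[rank - 1]

def percentile_from_counts(counts, q):
    """Same nearest-rank percentile, computed from a value -> count histogram."""
    total = sum(counts.values())
    rank = (q * total + 99) // 100
    seen = 0
    for value in sorted(counts):
        seen += counts[value]
        if seen >= rank:
            return value
    raise ValueError("empty histogram")

# Hypothetical latencies in ms: a busy fast host and a quiet slow host.
fast_host = [10] * 900
slow_host = [1000] * 100

# Averaging per-host percentiles ignores how many requests each host served.
averaged_p95 = (percentile(fast_host, 95) + percentile(slow_host, 95)) / 2

# The percentile over all requests, and the same value from merged histograms.
true_p95 = percentile(fast_host + slow_host, 95)
merged_p95 = percentile_from_counts(Counter(fast_host) + Counter(slow_host), 95)

print(averaged_p95, true_p95, merged_p95)  # 505.0 1000 1000
```

The averaged value of 505 ms matches no request that was actually served, while merging the per-host histograms recovers the p95 over all requests.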
If you liked the article, make sure to check out the GitHub course material. It's much broader and more up to date.
[1] https://www.usenix.org/conference/srecon19emea/presentation/...
heinrichhartman | 6 years ago
If you are looking for a monitoring vendor that cares deeply about getting the statistics right (especially around aggregating and analysing latency data), have a look at https://circonus.com / https://lps.circonus.com/statistics-for-engineers/ and/or reach out to me.
cassianoleal | 6 years ago
benogorek | 6 years ago
I knew someone who loved that book and taught corporate workers to throw literally all data into control charts. For instance, instead of doing a t-test, just string out the data in order of the classes and see if the points go outside the lines. I thought it was lazy, but if you're going to have one tool then I guess you could do worse than the control chart.
heinrichhartman | 6 years ago
Are you aware of people in the IT-Ops domain who use control charts?
cmroanirgo | 6 years ago
There are also quite a few other charting techniques that financiers have been using for decades, such as OHLC/bar/candlestick, point & figure, or market profile, which all have their place in data visualisation. Combining those with financial charting models (moving averages, stochastics, etc.) can go a long way in determining when things are going great or pear-shaped.
But otherwise a decent article.
BOOSTERHIDROGEN | 6 years ago
gigatexal | 6 years ago