
Time-Series Anomaly Detection: A Decade Review

446 points | belter | 1 year ago | arxiv.org | reply

80 comments

[+] bluechair|1 year ago|reply
Didn’t see it mentioned but good to know about: UCR matrix profile.

The Matrix Profile is honestly one of the most underrated tools in the time-series analysis space - it's ridiculously efficient. The killer feature is that it just works for finding motifs and anomalies, without the fiddling with thresholds that traditional techniques require. It's solid across domains too, from manufacturing sensor data to ECG analysis to earthquake detection.

https://www.cs.ucr.edu/~eamonn/MatrixProfile.html
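For intuition, here is a brute-force sketch of what the matrix profile computes (the real UCR algorithms, STOMP/SCRIMP++ style, are dramatically faster; this O(n²) version is for illustration only, and the data is made up):

```python
import numpy as np

def matrix_profile(ts, m):
    """Brute-force matrix profile: for each length-m window, the z-normalized
    Euclidean distance to its nearest non-trivial neighbor."""
    n = len(ts) - m + 1
    windows = np.array([ts[i:i + m] for i in range(n)])
    # z-normalize every window so only shape matters, not scale/offset
    mu = windows.mean(axis=1, keepdims=True)
    sigma = windows.std(axis=1, keepdims=True)
    sigma[sigma == 0] = 1.0
    zw = (windows - mu) / sigma
    excl = m // 2  # exclusion zone: ignore trivial matches near each window
    profile = np.full(n, np.inf)
    for i in range(n):
        d = np.sqrt(((zw - zw[i]) ** 2).sum(axis=1))
        d[max(0, i - excl):i + excl + 1] = np.inf
        profile[i] = d.min()
    return profile

rng = np.random.default_rng(0)
t = np.sin(np.linspace(0, 20 * np.pi, 1000)) + 0.05 * rng.standard_normal(1000)
t[500:520] += 2.0  # inject an anomalous bump
mp = matrix_profile(t, m=50)
print(int(np.argmax(mp)))  # index of the most anomalous window ("discord")
```

The window with the largest profile value is the top "discord", the subsequence least like anything else in the series, which is how it flags anomalies with no explicit threshold (though, as a reply notes, you do still choose the window length m).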

[+] jmpeax|1 year ago|reply
What do you mean you don't have to mess around with window sizes? Matrix profile is highly dependent on the window size.
[+] Croftengea|1 year ago|reply
MP is one of the best univariate methods, but it's actually mentioned in the article.
[+] sriram_malhar|1 year ago|reply
Thanks for sharing; I am most intrigued by the sales pitch. But the website is downright ugly.

This is a better presentation by the same folks. https://matrixprofile.org/

[+] bee_rider|1 year ago|reply
What does it do? Anything to do with matrices, like, from math?
[+] quijoteuniv|1 year ago|reply
I use the offset function in Prometheus to compute an average of past weeks as a recording rule. One of our systems is very "seasonal", with weekly cycles, so I average a metric over previous weeks ((offset 1 week + offset 2 weeks + offset 3 weeks + offset 4 weeks) / 4) and compare it to the metric's current value. That way the alarms can be set day or night, weekday or weekend, and the thresholds are dynamic: each value is compared against an average for that day of the week and time of day. Someone at GitLab posted a more in-depth explanation of this way of working: https://about.gitlab.com/blog/2019/07/23/anomaly-detection-u... Things get a bit more complicated with holidays, but you can actually program them into Prometheus: https://promcon.io/2019-munich/slides/improved-alerting-with...
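The same offset-average idea can be sketched outside Prometheus, e.g. in pandas (series, spike, and threshold are invented for illustration):

```python
import numpy as np
import pandas as pd

# Hourly metric with a weekly cycle (168 hours per week), plus noise.
rng = np.random.default_rng(1)
idx = pd.date_range("2024-01-01", periods=24 * 7 * 6, freq="h")
weekly = np.sin(2 * np.pi * np.arange(len(idx)) / 168)
metric = pd.Series(weekly + 0.1 * rng.standard_normal(len(idx)), index=idx)
metric.iloc[900] += 2.0  # inject a spike to detect

# Baseline: average of the same hour 1-4 weeks ago -- the pandas analogue of
# (m offset 1w + m offset 2w + m offset 3w + m offset 4w) / 4.
baseline = sum(metric.shift(168 * k) for k in range(1, 5)) / 4

# Dynamic threshold: alert when the value strays from its own seasonal baseline.
alerts = (metric - baseline).abs() > 0.6
```

Because the baseline follows the weekly shape, the same 0.6 threshold works day or night, weekday or weekend.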
[+] gr3ml1n|1 year ago|reply
Whenever I have a chart in Grafana that isn't too dense, I almost always add a line for the 7d offset value. Super useful to tell what's normal and what isn't.
[+] mikehollinger|1 year ago|reply
This doesn’t capture work that’s happened in the last year or so.

For example, some former colleagues' time-series foundation model (Granite TS) was doing pretty well when we were experimenting with it. [1]

An aha moment for me was realizing that you can think of anomaly models as effectively forecasting the next N steps, then noticing when the actual measured values are "different enough" from the expected ones. This is simple to draw on a whiteboard for one signal; it's pretty neat that it still works when the data is multivariate.

[1] https://huggingface.co/ibm-granite/granite-timeseries-ttm-r1
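That forecast-then-compare framing can be sketched with a naive persistence forecaster (this has nothing to do with Granite itself; the data, threshold, and forecaster are invented for illustration):

```python
import numpy as np

# Multivariate series: 3 correlated signals, 500 steps.
rng = np.random.default_rng(2)
steps = np.arange(500)
X = np.stack([np.sin(steps / 10), np.cos(steps / 10), np.sin(steps / 5)], axis=1)
X += 0.05 * rng.standard_normal(X.shape)
X[400] += 1.5  # anomaly: a sudden jump across all channels

# Naive forecaster: predict the next step as the current one (persistence).
forecast = X[:-1]
residual = np.linalg.norm(X[1:] - forecast, axis=1)  # one error per time step

# "Different enough": flag residuals beyond mean + 4 std of a calibration window.
calib = residual[:300]
threshold = calib.mean() + 4 * calib.std()
anomalies = np.where(residual > threshold)[0] + 1  # residual[i] scores step i+1
```

A real model replaces the persistence forecast with learned multi-step predictions, but the detection logic is the same whiteboard picture: forecast, subtract, threshold.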

[+] tessierashpool9|1 year ago|reply
what were you thinking then before your aha moment? :D
[+] apwheele|1 year ago|reply
Care to share the contexts in which someone needs a zero-shot model for time series? I have just never come across one in which you don't have some historical data to fit a model and go from there.
[+] Dowwie|1 year ago|reply
In the nascent world of water tech are IoT devices that monitor water flow. These devices can detect leaks and estimate fixture-level water consumption. Leak detection is all about identifying time-series outliers. The distribution-based anomaly detection mentioned in the paper is relevant here. Interestingly, a residence may require multiple distributions, due to pipe-temperature variations between warm and cold seasons.
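A minimal sketch of that per-season, distribution-based rule (all numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(3)
# Simulated daily water-flow readings: colder pipes, lower baseline flow.
winter = rng.normal(loc=100, scale=5, size=180)   # liters/day
summer = rng.normal(loc=130, scale=8, size=180)

def fit(season):
    # One normal distribution per season.
    return season.mean(), season.std()

def is_leak(reading, mu, sigma, k=3.0):
    # Distribution-based rule: flag readings beyond k standard deviations.
    return abs(reading - mu) > k * sigma

w_mu, w_sigma = fit(winter)
s_mu, s_sigma = fit(summer)

# A 125 L/day reading is an anomaly in winter but unremarkable in summer.
print(is_leak(125, w_mu, w_sigma), is_leak(125, s_mu, s_sigma))  # → True False
```

Scoring each reading against the distribution of its own season is what keeps the winter threshold from drowning in summer variance.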
[+] zaporozhets|1 year ago|reply
I recently tried to homebrew some anomaly detection work for a performance tracking project and was surprised at the absence of any off-the-shelf OSS or Paid solutions in this space (that weren’t super basic or way too complex). Lots of fertile ground here!
[+] jeffbee|1 year ago|reply
The reason there are not off-the-shelf solutions is this is an unsolved problem. There is no approach that is generally useful.
[+] ramon156|1 year ago|reply
I needed a TS anomaly detection for my internship because we needed to track when a machine/server was doing poorly or had unplanned downtime. I expected Microsoft's C# library to be able to do this, but my god, it's a mess. If someone has the time and will to implement a proper library, that would be awesome.
[+] phirschybar|1 year ago|reply
agreed. at my company we ended up rolling our own system. but this area is absolutely ripe for some configurable SaaS or OSS tool with advanced reporting and alerting mechanisms. Datadog has a decent offering, but it's pretty $$$$.
[+] hackernewds|1 year ago|reply
there's always Prophet: forecast the next value and look at the difference
[+] montereynack|1 year ago|reply
Gonna throw in my hat here, time series anomaly detection for industrial machinery is the problem my startup is working on! Specifically we’re making it work offline-by-default (we integrate the AI with the equipment, and don’t send data to any third party servers - even ours) because we feel there’s a ton of customer opportunities that get left in the dust because they can’t be online. If you or someone you know is looking for a monitoring solution for industrial machinery, or are passionate about security-conscious industrial software (we also are developing a data historian) let’s talk! www.sentineldevices.com
[+] jorl17|1 year ago|reply
I have a soft spot for this area. Almost 10 years ago, my Masters touched on something somewhat adjacent to this (Online Failure Prediction): https://estudogeral.uc.pt/handle/10316/99218

We built a system to detect exceptions before they happened, and act on them, hoping that this would be better than letting them happen (e.g. preemptively slow down the rate of requests instead of leading to database exhaustion)

At the time, I felt that there was soooooooo much to do in the area, and I'm kinda sad I never worked on it again.

[+] djoldman|1 year ago|reply
> Unfortunately, inherent complexities in the data generation of these processes, combined with imperfections in the measurement systems as well as interactions with malicious actors, often result in abnormal phenomena. Such abnormal events appear subsequently in the collected data as anomalies.

This is critical; and difficult to deal with in many instances.

> With the term anomalies we refer to data points or groups of data points that do not conform to some notion of normality or an expected behavior based on previously observed data.

This is a key problem or perhaps the problem: rigorously or precisely defining what an anomaly is and is not.

[+] hazrmard|1 year ago|reply
Anomaly detection (AD) can arguably be a value-add to any industry. It may not be a core product, but AD can help optimize operations for almost anyone.

* Manufacturing: Computer vision to pick anomalies off the assembly line.

* Operation: Accelerometers/temperature sensors w/ frequency analysis to detect onset of faults (prognostics / diagnostics) and do predictive maintenance.

* Sales: Timeseries analyses on numbers / support calls to detect up/downticks in cashflows, customer satisfaction etc.
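The second bullet, for example, often reduces to watching spectral energy in a band where a fault signature is expected. A toy sketch (sample rate, fault frequency, and amplitudes are invented):

```python
import numpy as np

fs = 1000  # sample rate (Hz)
t = np.arange(0, 1, 1 / fs)
healthy = np.sin(2 * np.pi * 50 * t)                   # normal 50 Hz vibration
faulty = healthy + 0.4 * np.sin(2 * np.pi * 120 * t)   # fault adds a 120 Hz tone

def band_energy(x, lo, hi):
    # Sum of spectral magnitude in [lo, hi] Hz.
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    return spec[(freqs >= lo) & (freqs <= hi)].sum()

# Monitor energy in the band where the fault signature is expected.
print(band_energy(healthy, 100, 140), band_energy(faulty, 100, 140))
```

Tracking that band energy over time and alarming on its rise is the classic prognostics pattern for bearing and gear faults.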

[+] itissid|1 year ago|reply
Can someone explain how SVMs are being classified in this paper as "Distribution-Based"? This is quite confusing as a taxonomy. They generally don't estimate densities, either model-free (kernel density estimates) or model-based (separating one or more possibly overlapping normal distributions).

I get that they could be explicitly modeling a data-generating process's probability (just like a NN), e.g. a Bernoulli (whose ML loss function is cross-entropy) or a Normal (ML loss function: mean squared error), but I don't think that is what the author meant by a Distribution.

My understanding is that they don't make distributional assumptions about the random variable (your Y or X) they are trying to find a max margin for.

[+] mlepath|1 year ago|reply
The process-centric taxonomy in this paper is one of the most structured frameworks I’ve seen for anomaly detection methods. It breaks down approaches into distance-based, density-based, and prediction-based categories. In practice (been doing time series analysis professionally for 8+ years), I’ve found that prediction-based methods (e.g., reconstruction errors in autoencoders) are fantastic for semi-supervised use cases but fall short for streaming data.
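The reconstruction-error idea can be sketched without a neural net, using PCA as a linear stand-in for an autoencoder (window size, latent dimension, and data are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
# Train on windows of normal (sinusoidal) behavior.
t = np.sin(np.linspace(0, 40 * np.pi, 4000)) + 0.05 * rng.standard_normal(4000)
w = 40
train = np.array([t[i:i + w] for i in range(0, 3000, 5)])

# "Encoder": project windows onto the top-k principal components;
# "decoder": project back. Normal windows reconstruct well.
mean = train.mean(axis=0)
_, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
components = Vt[:4]  # latent dimension 4

def reconstruction_error(window):
    z = (window - mean) @ components.T   # encode
    recon = z @ components + mean        # decode
    return np.linalg.norm(window - recon)

normal = t[3100:3100 + w]
anomaly = normal.copy()
anomaly[10:20] += 1.5  # inject a bump the model never saw
print(reconstruction_error(normal), reconstruction_error(anomaly))
```

Shapes the model was trained on reconstruct with small error; anything off-manifold reconstructs poorly, which is the semi-supervised appeal: you only ever need normal data to train.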
[+] leeoniya|1 year ago|reply
a colleague is doing a FOSDEM 2025 talk about https://github.com/grafana/augurs
[+] eskaytwo|1 year ago|reply
Thanks for the pointer. Augurs looks really promising. If the matrix profile method was included it would be a nice alternative to the numba jit methods that are commonplace.
[+] mathewshen|1 year ago|reply
Very glad to see this paper here; it deserves it! I have followed Dr. Boniol's work since 2021 (starting with the Series2Graph paper). Series2Graph is a very good algorithm that works well in some complex situations, and his later works, like New Trends in Time-Series Anomaly Detection, TSB-UAD, Theseus, and k-Graph, are very insightful too.
[+] countzro|1 year ago|reply
I also liked the main idea of Series2Graph but found the implementation a bit complicated.

There is a similar algorithm with a simpler implementation in this paper: „GraphTS: Graph-represented time series for subsequence anomaly detection“ https://pmc.ncbi.nlm.nih.gov/articles/PMC10431630/

The approach is for univariate time series and I found it to perform well (with very minor tweaks).

[+] whatever1|1 year ago|reply
Sometimes HN just reads my mind. This was exactly the topic I was looking into this week.
[+] brainwipe|1 year ago|reply
Wonderful! My PhD was in stream anomaly detection using dynamic neural networks in 2003. Can't wait to go deep through this paper and find out what the latest thinking is. Thanks, OP.
[+] lmc|1 year ago|reply
It would be useful to see some discussion of sampling regularity, e.g., whether some of these methods can be used with unevenly spaced time series. I work with satellite image time series, and clouds mean my usually-weekly series can sometimes be missing points for months. We often employ interpolation, but that can be a major source of error.
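For what it's worth, pandas' time-weighted interpolation is one common stopgap here (timestamps and values are invented; as noted, interpolation can itself be a major error source over months-long gaps):

```python
import numpy as np
import pandas as pd

# Irregular acquisitions (cloudy dates missing), one unknown value.
idx = pd.to_datetime(["2024-01-01", "2024-01-08", "2024-02-26", "2024-03-01"])
obs = pd.Series([0.40, 0.47, np.nan, 0.60], index=idx)

# method="time" weights by the actual time gaps, so the 2024-02-26 estimate
# lands close to the 2024-03-01 value -- not halfway between its neighbors,
# as position-based linear interpolation would put it.
filled = obs.interpolate(method="time")
print(filled["2024-02-26"])  # ≈ 0.59, close to the March value
```

That distinction only matters precisely when sampling is uneven, which is the satellite case described above.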
[+] eth0up|1 year ago|reply
I had not known of Time Series (or most other) anomaly detection methods until recently, when I used several LLMs to assist with an analysis of the Florida Lottery Pick4 history.

For years, I'd been casually observing the daily numbers (2 draws daily for each since around ?/?/2004?, and 1 prior), which are Pick2, 3, 4, and 5, but mostly Pick4, which is 4 digits, thus has 1:10,000 odds, vs 1:100, 1:1000 and 1:100,000 for the others.

With truly random numbers, it is pretty difficult to identify anything but glaring anomalies. Among the tests performed were: clusters (daily/weekly), isolation forest, popular permutations (by date, special holidays, etc.), individual digit deviations, temporal frequency, DBSCAN, z-score, patterns, correlation, external factors, autocorrelation by cluster, predictive modeling, chi-squared, time series ... and a few more I've forgotten.

For those wondering why I'd do this, around 2023-23, the FL Lottery drastically modified their website. Previously, one could enter a number for the game of their choice and receive all historical permutations of that number over all years, going back to the 1990s. With the new modification, the permutations have been eliminated and the history only shows for 2 years. The only option for the complete history is to download the provided PDF -- however, it is full of extraneous characters and cannot be readily searched via Ctrl-F, etc. Processing this PDF involves extensive character removal to render it parsable or modestly readable. So to restore the previously functional search ability, manual work is required. The seemingly deliberate obfuscation, or obstruction, was my motivation. The perceived anomalies over the years were secondary, as I am capable of little more than speculation without proper testing. But those two factors intrigued me.

Having no background in math and only feeble programming abilities, this was a task I could not have performed without LLMs and the Python code used for the tests. The analysis is still incomplete, having grown in complexity as I progressed until it left me too tired to persist. The results were ultimately within acceptable ranges of randomness, but some patterns were present. I had made files of: all numbers that have ever occurred; all numbers that have never occurred; and the popularity of single, isolated digits (I was actually correct in my intuition here, which proved certain single digits were occurring with lesser or greater frequency than would be expected). I also attempted a script to apply optical character recognition (OCR) to the website and append the latest results to a living text and PDF file, to offer anyone interested an opportunity to freely search, parse, and analyze the numbers. But I couldn't quite wangle the OCR successfully.

This means working with a set of over 60k individual number sets, looking for anomalies over a 30-year period; if there are other methods anyone would suggest, please offer them and I might resume this abandoned project.
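For the digit-frequency question in particular, a chi-squared goodness-of-fit test against the uniform hypothesis is a standard first formal check. A sketch with simulated draws (swap in the real Pick 4 digit history):

```python
import numpy as np

rng = np.random.default_rng(5)
# Stand-in for ~60k Pick 4 draws: each draw is 4 digits, 0-9.
draws = rng.integers(0, 10, size=(60000, 4))
digits = draws.ravel()

counts = np.bincount(digits, minlength=10)
expected = len(digits) / 10

# Chi-squared statistic against the uniform hypothesis (df = 9).
chi2 = ((counts - expected) ** 2 / expected).sum()

# The critical value for df=9 at the 5% level is ~16.92; a larger statistic
# means the digit frequencies deviate more than chance would explain.
print(chi2, chi2 > 16.92)
```

The same statistic can be computed per position (first digit, second digit, ...) or per year to localize where any bias lives.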

[+] khafra|1 year ago|reply
Traditional anomaly detection is unlikely to find the signature of a bad pseudo-random number generator. You probably want something more like the NIST randomness test suite, the pyEntropy library, or mathy stuff like linear congruential generator analysis or testing for specific Mersenne Twisters.
[+] ukuina|1 year ago|reply
You could use a Visual LLM to transcribe the PDF back into JSON data for you.

Something like: ghostpdf to convert PDF into images, then gpt-4o or ollama+Llama3 to transcribe each image into output JSON.

[+] sigma33|1 year ago|reply
Could just use the API the web page uses and parse the JSON

https://apim-website-prod-eastus.azure-api.net/drawgamesapp/...

Gets you pick 4 for the 6 Jan, easy to parse.

.... "FireballPayouts": 18060.5, "DrawNumbers": [ { "NumberPick": 3, "NumberType": "wn1" }, { "NumberPick": 0, "NumberType": "wn2" }, { "NumberPick": 0, "NumberType": "wn3" }, { "NumberPick": 4, "NumberType": "wn4" }, { "NumberPick": 1, "NumberType": "fb" } ...

[+] lebotte|1 year ago|reply
Time-series anomaly detection involves using techniques like forecasting and historical data offsets to dynamically identify deviations in patterns, as discussed in practical applications with tools like Prometheus and Grafana.