
Alert Evaluations: Incremental Merges in ClickHouse

38 points | vadman97 | 1 year ago | highlight.io

7 comments


hodgesrm|1 year ago

It sounds as if you used your own algorithm for pre-aggregation into the schema you showed. Did you consider using a materialized view to populate it?

vadman97|1 year ago

We insert from our alerts worker because we want the aggregation to happen per alert, with the aggregated data filtered by that alert's particular definition. As each alert is evaluated, we run the INSERT INTO ... SELECT ... statement in [1], built from the alert definition. We can't aggregate with an MV, since we'd need to create one MV per unique alert a customer may set up.

[1]: https://github.com/highlight/highlight/blob/c526daea31fdf764...
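A rough sketch of what such a per-alert pre-aggregation insert could look like. The table names, columns, filter, and alert id below are hypothetical illustrations, not taken from the linked code; the `-State` combinator is one common ClickHouse pattern for storing mergeable partial aggregates:

```sql
-- Hypothetical: pre-aggregate matching rows for one alert definition
-- into per-minute buckets of mergeable aggregate state.
INSERT INTO alert_metric_states
SELECT
    12345                      AS alert_id,    -- this alert's id (assumed)
    toStartOfMinute(timestamp) AS bucket,
    countState()               AS count_state  -- partial state, mergeable later
FROM traces
WHERE service_name = 'frontend'                -- the alert's filter (assumed)
  AND timestamp >= now() - INTERVAL 1 MINUTE
GROUP BY bucket
```

Because the filter and grouping come from the alert definition at evaluation time, this query can't be expressed as a single static materialized view, which is the constraint described above.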

nitinreddy88|1 year ago

Precisely. What's stopping them from using a simple incremental materialized view?

iampims|1 year ago

At a certain scale, exact computations (p50 for instance) become impractical. I’ve had great luck switching to approximate calculations with guaranteed error bounds.

An approachable paper on the topic is "Effective Computation of Biased Quantiles over Data Streams" http://dimacs.rutgers.edu/%7Egraham/pubs/papers/bquant-icde....

hipadev23|1 year ago

The quantile() function in ClickHouse is also approximate, though it uses a simpler reservoir-sampling model than GK; quantileGK() is available as well. quantileExact() exists but, as you point out, becomes impractical at scale.
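For illustration, the three variants side by side. The table and column are hypothetical; the accuracy argument to quantileGK() and the sampling behavior of quantile() follow the ClickHouse documentation, but treat this as a sketch rather than a benchmark:

```sql
-- Hypothetical comparison of ClickHouse median estimators on one column.
SELECT
    quantile(0.5)(duration_ms)        AS p50_sampled, -- reservoir sampling, fixed memory
    quantileGK(100, 0.5)(duration_ms) AS p50_gk,      -- Greenwald-Khanna sketch, accuracy = 100
    quantileExact(0.5)(duration_ms)   AS p50_exact    -- exact, but memory grows with input size
FROM traces
```

The sampled and GK results trade a bounded error for constant memory, which is why they stay practical where quantileExact() does not.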