wardb | 3 years ago

Unfortunately, the benchmark severely misunderstands how Grafana Loki should be queried for high-cardinality data. See also https://github.com/SigNoz/logs-benchmark/issues/1

pranay01|3 years ago

Thanks for creating the issue. Yeah, this is what we found as well: Loki is not designed for querying high-cardinality data.

But since Loki is often used for observability, where there is sometimes a need to query high-cardinality data, we decided to include it.

wardb|3 years ago

That's incorrect. Loki is designed for querying high-cardinality data.

The difference is that in Loki the index is used only for metadata about the source of the log lines (environment, team, cluster, host, pod, etc.), which selects the right log streams to search in.

Parsing, aggregation and/or filtering of log lines on high cardinality data is all done at query time using LogQL. See also https://www.youtube.com/watch?v=UiiZ463lcVA and this live example where a 95th quantile is calculated using the request_time field of nginx logs https://play.grafana.org/d/T512JVH7z/loki-nginx-service-mesh...
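To make the split concrete, here is a rough Python sketch of the idea, not Loki's actual implementation. Everything in it is made up for illustration: the in-memory `STREAMS` list, the `env`/`host` labels, the JSON log format, and the nearest-rank quantile (Loki's own quantile calculation may differ). Only the stream labels are matched up front; the high-cardinality `request_time` field is parsed out of the raw lines at query time. In LogQL the equivalent query would look something like `quantile_over_time(0.95, {env="prod"} | json | unwrap request_time [1m])`, assuming JSON-formatted nginx logs.

```python
import json
import math

# Hypothetical in-memory store. Only the low-cardinality stream labels
# (env, host) play the role of the index; log lines are stored unindexed.
STREAMS = [
    ({"env": "prod", "host": "web-1"}, [
        '{"path": "/a", "request_time": 0.10}',
        '{"path": "/b", "request_time": 0.30}',
    ]),
    ({"env": "prod", "host": "web-2"}, [
        '{"path": "/a", "request_time": 0.20}',
        '{"path": "/c", "request_time": 0.90}',
    ]),
    ({"env": "dev", "host": "web-3"}, [
        '{"path": "/a", "request_time": 5.00}',
    ]),
]


def select_streams(matchers):
    """Step 1: use the label 'index' to pick the streams to scan."""
    return [
        lines
        for labels, lines in STREAMS
        if all(labels.get(key) == value for key, value in matchers.items())
    ]


def query_values(matchers, field):
    """Step 2: parse each line at query time and extract a numeric field."""
    values = []
    for lines in select_streams(matchers):
        for line in lines:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip lines that are not valid JSON
            if field in record:
                values.append(float(record[field]))
    return values


def quantile(values, q):
    """Nearest-rank quantile (an assumption for this sketch)."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


p95 = quantile(query_values({"env": "prod"}, "request_time"), 0.95)
print(p95)
```

Note that `request_time` and `path` never appear in the label set, so their cardinality does not affect how many streams exist; it only affects how much data has to be scanned and parsed per query.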

wstuartcl|3 years ago

This is kind of the issue with an interested party/vendor running benchmarks like these. Be it by pure dumb luck or malfeasance, you are much more likely to know your own product and configure it well than the others, and so to toss out responses and results that are wildly inaccurate or misleading.