Thanks for creating the issue. Yeah, this is what we also found: Loki is not designed for querying high-cardinality data.
But since Loki is often used in observability use cases, where there is sometimes a need to query high-cardinality data, we thought we would include it.
That's incorrect: Loki is designed for querying high-cardinality data.
The difference is that in Loki the index is only used for metadata about the source of the log lines (environment, team, cluster, host, pod, etc.), to select the right log stream to search in.
This is kind of the issue with an interested party/vendor running benchmarks like these. Whether by pure dumb luck or malfeasance, you are much more likely to know and correctly configure your own product than the others, and so to toss out responses and results that are wildly inaccurate or misleading.
pranay01|3 years ago
But since Loki is often used in observability use cases, where there is sometimes a need to query high-cardinality data, we thought we would include it.
wardb|3 years ago
The difference is that in Loki the index is only used for metadata about the source of the log lines (environment, team, cluster, host, pod, etc.), to select the right log stream to search in.
Parsing, aggregation, and filtering of log lines on high-cardinality data are all done at query time using LogQL. See also https://www.youtube.com/watch?v=UiiZ463lcVA and this live example, where a 95th quantile is calculated from the request_time field of nginx logs: https://play.grafana.org/d/T512JVH7z/loki-nginx-service-mesh...
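To make the two-step idea concrete, here is a rough Python sketch (not Loki's actual implementation). It assumes a toy in-memory store where only low-cardinality labels are indexed and the high-cardinality fields (`request_time`, `user_id`) live inside the unindexed log line and are parsed at query time. The label names, the sample data, and the helper names `select_streams` and `p95_request_time` are all made up for illustration, and the quantile uses a simple nearest-rank method, which may differ from what Loki computes. In LogQL the equivalent query would look roughly like `quantile_over_time(0.95, {cluster="prod", job="nginx"} | json | unwrap request_time [1m])`, assuming the nginx logs are JSON and carry those labels.

```python
import json
import math

# Hypothetical sample data. Keys are the indexed, low-cardinality labels;
# values are raw log lines, which are NOT indexed.
PROD_TIMES = [0.05, 0.07, 0.09, 0.11, 0.12, 0.15, 0.2, 0.3, 0.5, 1.8]
STREAMS = {
    (("cluster", "prod"), ("job", "nginx")): [
        json.dumps({"request_time": t, "user_id": f"u-{i}"})
        for i, t in enumerate(PROD_TIMES)
    ],
    (("cluster", "dev"), ("job", "nginx")): [
        json.dumps({"request_time": 9.9, "user_id": "u-dev"}),
    ],
}


def select_streams(streams, **labels):
    """Step 1: use only the label index to pick which streams to scan."""
    wanted = set(labels.items())
    return [lines for key, lines in streams.items() if wanted <= set(key)]


def p95_request_time(streams, **labels):
    """Step 2: parse the selected lines at query time and aggregate."""
    values = sorted(
        json.loads(line)["request_time"]
        for lines in select_streams(streams, **labels)
        for line in lines
    )
    # Nearest-rank 95th percentile over the parsed values.
    rank = math.ceil(0.95 * len(values))
    return values[rank - 1]


print(p95_request_time(STREAMS, cluster="prod", job="nginx"))
```

The point of the sketch is that `user_id` and `request_time` never appear in the index, so their cardinality costs nothing at ingest time; the cost is paid when scanning and parsing the selected streams at query time.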
wstuartcl|3 years ago