top | item 34761620

FrenchTouch42 | 3 years ago

Can you share more information about the schema you're mentioning? Thank you!

ignoramous|3 years ago

Not OP but they might be referring to Uber moving from ES to ClickHouse to store their schema-flexible, structured logs, mostly to improve ingestion performance: https://archive.is/bFsTF / https://www.uber.com/blog/logging/

The gist of it is:

- Structured logs (JSON) are stored as kv pairs in parallel arrays, alongside metadata (host, timestamp, id, geo, namespace, etc.).

- Log fields (i.e. kv pairs) are materialized (indexed) depending on query patterns, and vacuumed up if unused.

- Authoring queries and supporting Kibana dashboards is not trivial, but both are handled by a query translation layer.
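
To make the first two points concrete, here is a rough Python sketch of the layout as I understand it. It is not Uber's code: `METADATA_FIELDS`, `to_row` and `materialize` are names I made up for illustration, and it assumes flat (non-nested) JSON with all values stringified.

```python
import json

# Hypothetical metadata fields; the real set is whatever the pipeline defines.
METADATA_FIELDS = ("host", "timestamp")


def to_row(log_line: str) -> dict:
    """Split one flat JSON log into metadata columns plus parallel key/value arrays."""
    record = json.loads(log_line)
    # Metadata gets its own fixed columns.
    row = {name: record.pop(name, None) for name in METADATA_FIELDS}
    # Everything else goes into two arrays that line up index by index.
    row["keys"] = list(record)
    row["values"] = [str(record[k]) for k in row["keys"]]
    return row


def materialize(row: dict, field: str) -> dict:
    """Copy one frequently queried field out of the arrays into its own column."""
    out = dict(row)
    if field in row["keys"]:
        out[field] = row["values"][row["keys"].index(field)]
    else:
        out[field] = None  # field absent from this log line
    return out


row = to_row('{"host": "h1", "timestamp": 1676300000, "level": "error", "status": 500}')
wide = materialize(row, "status")
```

The point of the split is that new log fields never require a schema change: they just become another entry in `keys` and `values`, and only the fields that queries actually hit get promoted to dedicated columns.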

atombender|3 years ago

What do you mean by parallel arrays here?

Do you mean something like two arrays [k1, ..., kN] and [v1, ..., vN] in two different columns?

Is there a way in ClickHouse to filter such a pair of arrays such that you can do a search akin to vals[indexOfKey("foo")] == "bar"?
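
To spell out the semantics I have in mind, here it is in plain Python rather than SQL. This is only a sketch of the lookup I'm asking about: `lookup` and `matches` are hypothetical helpers, and treating a missing key as "no match" is my assumption.

```python
from typing import Optional


def lookup(keys: list[str], vals: list[str], key: str) -> Optional[str]:
    """Return the value stored at the same position as `key`, or None if absent."""
    try:
        return vals[keys.index(key)]
    except ValueError:  # key is not in this row's key array
        return None


def matches(keys: list[str], vals: list[str], key: str, expected: str) -> bool:
    """Equivalent of vals[indexOfKey(key)] == expected for a single row."""
    return lookup(keys, vals, key) == expected


# Each row is a (keys, vals) pair, i.e. the two array columns.
rows = [
    (["foo", "level"], ["bar", "info"]),
    (["foo"], ["baz"]),
    (["level"], ["error"]),
]
hits = [r for r in rows if matches(r[0], r[1], "foo", "bar")]
```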