item 34761620 (no title)

FrenchTouch42 | 3 years ago
Can you share more information about the schema you're mentioning? Thank you!

ignoramous | 3 years ago
Not OP, but they might be referring to Uber moving from ES to ClickHouse to store their schema-flexible, structured logs, mostly to improve ingestion performance: https://archive.is/bFsTF / https://www.uber.com/blog/logging/
The gist of it is:
- Structured logs (JSON) are stored as kv pairs in parallel arrays, alongside metadata (host, timestamp, id, geo, namespace, etc.).
- Log fields (i.e. kv pairs) are materialized (indexed) depending on query patterns, and vacuumed up if unused.
- Authoring queries and supporting Kibana dashboards is not trivial, but is handled with a query translation layer.

atombender | 3 years ago
What do you mean by parallel arrays here?
Do you mean something like two arrays [k1, ..., kN] and [v1, ..., vN] in two different columns?
Is there a way in ClickHouse to filter such a pair of arrays such that you can do a search akin to vals[indexOfKey("foo")] == "bar"?
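To make the parallel-array idea and the lookup asked about above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Uber's actual schema: the names `LogRow`, `flatten`, `to_row`, `field_value`, and the columns `keys`/`vals` are all hypothetical. On the ClickHouse side, the built-in `indexOf(arr, x)` array function returns the 1-based position of `x` (0 if absent), so a filter along the lines of `vals[indexOf(keys, 'foo')] = 'bar'` is the likely shape of such a query; that expression is untested here and assumes those column names.

```python
from dataclasses import dataclass, field


@dataclass
class LogRow:
    """One stored log row: fixed metadata plus two parallel arrays (hypothetical layout)."""
    host: str
    timestamp: int
    keys: list[str] = field(default_factory=list)  # [k1, ..., kN]
    vals: list[str] = field(default_factory=list)  # [v1, ..., vN], same order as keys


def flatten(log: dict, prefix: str = "") -> list[tuple[str, str]]:
    """Flatten a possibly nested JSON-like dict into dotted (key, value) pairs."""
    pairs = []
    for k, v in log.items():
        name = f"{prefix}{k}"
        if isinstance(v, dict):
            pairs.extend(flatten(v, prefix=name + "."))
        else:
            pairs.append((name, str(v)))  # values stored as strings for simplicity
    return pairs


def to_row(host: str, timestamp: int, log: dict) -> LogRow:
    """Split the flattened pairs into the two parallel arrays."""
    pairs = flatten(log)
    return LogRow(host, timestamp, [k for k, _ in pairs], [v for _, v in pairs])


def field_value(row: LogRow, key: str):
    """Equivalent of vals[indexOfKey(key)]; returns None when the key is absent."""
    try:
        return row.vals[row.keys.index(key)]
    except ValueError:
        return None


rows = [
    to_row("host-a", 1, {"foo": "bar", "req": {"status": 200}}),
    to_row("host-b", 2, {"foo": "baz"}),
    to_row("host-c", 3, {"other": "bar"}),
]

# Filter: keep rows where the value stored under key "foo" equals "bar".
matches = [r.host for r in rows if field_value(r, "foo") == "bar"]
print(matches)
```

Note that the lookup scans the key array on every row, which is presumably why frequently queried fields get materialized into their own indexed columns, as the second bullet describes.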