top | item 34226664

(no title)

AdamProut | 3 years ago

Most SQL Analytical databases don't discourage joins (think BigQuery, Redshift, Snowflake, etc.) and all of them are columnstores. I think discouraging joins is something very specific to Clickhouse, Druid, Pinot and others that have very limited support for joins.

discuss

order

hodgesrm|3 years ago

ClickHouse at least supports local joins quite well.

Perhaps another way to put it is that BigQuery, Redshift, and Snowflake are not optimized for real-time response on large, wide tables. ClickHouse has features that allow it to pack multiple entities in a single table, then pull data out in a single scan. This includes tricks like simulating joins using aggregation. [0] It's a great design for feeding tenant dashboards with fixed latency (say 2s or less) and predictable cost. This use case is shared by many SaaS offerings.

Over time I think the differences will become less as current database engines converge on commonly required features. ClickHouse join types have expanded over the last year and features like join reordering are in the 2023 roadmap. [1] Conversely incumbent cloud databases are adding features to support real-time analytics.

I work on ClickHouse at Altinity and can't speak for Druid and Pinot. Perhaps someone else with detailed knowledge can chip in.

[0] https://www.databricks.com/dataaisummit/session/opening-floo...

[1] https://github.com/ClickHouse/ClickHouse/issues/44767