fodkodrasz | 1 month ago
I think there are solutions for that scale of data already, and simplicity is the best feature of DuckDB (at least for me).
augusteo|1 month ago
This is a fair point, but I think there's a middle ground. DuckDB handles surprisingly large datasets on a single machine, but "surprisingly large" still has limits. If you're querying 10TB of parquet files across S3, even DuckDB needs help.
The question is whether Ray is the right distributed layer for this. Curious what the alternative would be: Spark feels like overkill, but rolling your own coordination is painful.
AnEro|1 month ago