Because it means you need to keep another copy of your data in a special format just for DuckDb. The point of Parquet is that it’s an open format queryable by multiple tools. You don’t need to wait to load every table into a new format, you don’t need to retain multiple copies, and you don’t need to keep them in sync.
If DuckDb is the only query engine in your analytics stack, then it makes sense to use its specialized format. But that’s not the typical Lakehouse use case.
riku_iki|1 year ago
chatmasta|1 year ago
If DuckDb is the only query engine in your analytics stack, then it makes sense to use its specialized format. But that’s not the typical Lakehouse use case.