marsupialtail_2 | 2 years ago

While we are on this topic, the challenge for Python-based projects like Daft and Quokka (which I work on) is the poor Python support in data lake formats like Delta, Iceberg, and Hudi. Delta has the best support, but its Python API consistently lags behind the Java one. Iceberg doesn't support Python writes. Hudi doesn't support Python at all.

I have users demanding Iceberg writes and Hudi reads/writes. I don't know what to tell them, since I don't have the resources to write a reader/writer for those formats myself.

Hopefully, as DuckDB becomes more popular, we will see Python bindings for these popular data lake formats this year.
