top | item 33322800

(no title)

abrazensunset | 3 years ago

"Lakehouse" usually means a data lake (bunch of files in object storage with some arbitrary structure) that has an open source "table format" making it act like a database. E.g. using Iceberg or Delta Lake to handle deletes, transactions, concurrency control on top of parquet (the "file format").

The advantage is that various query engines will make it quack like a database, but you have a completely open interop layer that will let any combination of query engines (or just SDKs that implement the table format, or whatever) coexist. And in addition, you can feel good about "owning" your data and not being overtly locked in to Snowflake or Databricks.

discuss

order

No comments yet.