pacbard | 11 months ago
A Parquet file is a static file that holds all the data for a table. You can't insert, update, or delete rows in place; the file is what it is. That works fine for small tables, but it gets unwieldy when every data change means replacing the whole table.
Apache Iceberg fixes this problem by adding a metadata layer on top of many smaller Parquet files (that's the 300,000 ft overview).
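To make the "metadata layer over small immutable files" idea concrete, here is a toy sketch in plain Python. It is not Iceberg's actual format or API: `ToyTable`, `append`, `delete_where`, and `scan` are names I made up, and JSON files stand in for Parquet files so it runs with only the standard library. The point is just that a change writes a small new file and a new snapshot entry instead of replacing the whole table.

```python
import json
import tempfile
from pathlib import Path


class ToyTable:
    """Toy table: immutable data files plus a snapshot log of live files.

    Hypothetical illustration only; JSON files stand in for Parquet files.
    """

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)
        self.snapshots = [[]]  # each snapshot lists the live data file names
        self._counter = 0

    def _write_file(self, rows):
        # Data files are written once and never modified afterwards.
        self._counter += 1
        path = self.root / f"data-{self._counter:05d}.json"
        path.write_text(json.dumps(rows))
        return path.name

    def append(self, rows):
        # An insert adds one new small file; existing files are untouched.
        new_file = self._write_file(rows)
        self.snapshots.append(self.snapshots[-1] + [new_file])

    def delete_where(self, predicate):
        # A delete rewrites only the files that contain matching rows.
        live = []
        for name in self.snapshots[-1]:
            rows = json.loads((self.root / name).read_text())
            kept = [r for r in rows if not predicate(r)]
            if len(kept) == len(rows):
                live.append(name)  # unaffected file is reused as-is
            elif kept:
                live.append(self._write_file(kept))
        self.snapshots.append(live)

    def scan(self, snapshot=-1):
        # Reading the table means reading every file the snapshot lists.
        rows = []
        for name in self.snapshots[snapshot]:
            rows.extend(json.loads((self.root / name).read_text()))
        return rows


table = ToyTable(Path(tempfile.mkdtemp()))
table.append([{"id": 1}, {"id": 2}])
table.append([{"id": 3}])
table.delete_where(lambda r: r["id"] == 3)

print(table.scan())            # current snapshot
print(table.scan(snapshot=2))  # older snapshot, before the delete
```

Because old files are never overwritten, earlier snapshots stay readable, which is roughly how these table formats can offer time travel. The real thing adds a lot that this sketch skips (schemas, partitioning, atomic commits, file statistics).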
pgwhalen|11 months ago
hobs|11 months ago
inkyoto|11 months ago
So it is CSV++, so to speak: CSV + metadata + compact data storage in a single file, but not a database table gone astray to wander the world on its own as a file.
victor106|11 months ago
Delta format also supports this, correct?
orthoxerox|11 months ago