mccanne | 3 years ago

Author here. Agreed! Validation is important. While I didn't make this point in the article, our thinking is that schema validation does not require the serialization format to use schemas as its building block. You can always implement schema (or type) validation (and versioning) on top of super-structured data, as can also be done with document databases.
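To make "validation on top" concrete, here is a minimal sketch in plain Python. It is not the article's implementation: the names `SCHEMAS`, `infer_type`, and `validate` are hypothetical, and a dict of Python values stands in for a self-describing record. The point is only that the data carries its own types, and a versioned schema is checked as a separate layer.

```python
from typing import Any, Dict, List

# Hypothetical versioned schemas kept outside the data: version -> {field: type}.
SCHEMAS: Dict[int, Dict[str, type]] = {
    1: {"id": int, "name": str},
    2: {"id": int, "name": str, "email": str},
}


def infer_type(record: Dict[str, Any]) -> Dict[str, type]:
    """Derive a record's type from the record itself (it is self-describing)."""
    return {name: type(value) for name, value in record.items()}


def validate(record: Dict[str, Any], schema: Dict[str, type]) -> List[str]:
    """Return a list of mismatches; an empty list means the record conforms."""
    actual = infer_type(record)
    errors = []
    for name, expected in schema.items():
        if name not in actual:
            errors.append(f"missing field: {name}")
        elif actual[name] is not expected:
            errors.append(
                f"{name}: expected {expected.__name__}, got {actual[name].__name__}"
            )
    # Fields the schema does not know about are reported, not silently dropped.
    for name in sorted(actual.keys() - schema.keys()):
        errors.append(f"unexpected field: {name}")
    return errors


if __name__ == "__main__":
    record = {"id": 7, "name": "ada"}
    print(validate(record, SCHEMAS[1]))  # conforms to version 1
    print(validate(record, SCHEMAS[2]))  # version 2 also wants "email"
```

Because the check is a layer rather than part of the format, the same record can be tested against several schema versions without re-encoding it.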

cmollis|3 years ago

This is a major hassle when converting from Avro (coming from Kafka, which uses a schema registry, so schemas are not shipped with the Avro data) and storing in Parquet, which requires a schema in the file, though you can 'upgrade' it with another schema when reading. It would be great to have a binary, protocol-like format (schema-less Avro) and a schema-less columnar storage format, which I guess is what these guys are doing.