adolgert | 10 years ago

I may not agree with Cyrille, but what about alternatives for storing structured binary data that play well with newer tools like Spark? ASN.1 and Google Protocol Buffers both specify a binary encoding and generate language-specific encoding and decoding code. Is there a set of lightweight binary data tools we're missing?

santaclaus|10 years ago

How widely supported are the alternatives in the wider ecosystem? It is trivial to read and write HDF5 files in Python, Matlab, Mathematica, etc.

adolgert|10 years ago

That's a good question. Both ASN.1 and Protocol Buffers have more limited language coverage (ASN.1 is ancient but venerable, and still heavily used at NCBI), but maybe we should expand that list. These are tools that serialize buffers against razor-sharp binary specifications. I, too, use HDF5 for all of its features, but someone who is rolling their own format, for instance under Spark, should at least have a solid binary specification.
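
To make "a solid binary specification" concrete, here is a minimal sketch using only Python's standard `struct` module. The record layout, the `REC1` magic value, and the `encode`/`decode` helper names are all made up for illustration; they are not part of ASN.1, Protocol Buffers, or HDF5. The point is only that byte order, field widths, and length checks are written down explicitly rather than left to the platform.

```python
import struct

# Hypothetical record layout (an assumption for illustration only):
#   magic    4 bytes   b"REC1"
#   version  uint16    little-endian
#   count    uint32    little-endian, number of samples that follow
#   samples  count x float64, little-endian
# The "<" prefix fixes byte order and disables padding, so the
# header is always exactly 10 bytes on every platform.
HEADER = struct.Struct("<4sHI")
MAGIC = b"REC1"


def encode(samples, version=1):
    """Serialize a list of floats into the hypothetical record format."""
    header = HEADER.pack(MAGIC, version, len(samples))
    body = struct.pack(f"<{len(samples)}d", *samples)
    return header + body


def decode(buf):
    """Parse a record, rejecting anything that violates the layout."""
    if len(buf) < HEADER.size:
        raise ValueError("buffer shorter than header")
    magic, version, count = HEADER.unpack_from(buf, 0)
    if magic != MAGIC:
        raise ValueError("bad magic bytes")
    expected = HEADER.size + 8 * count
    if len(buf) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(buf)}")
    samples = list(struct.unpack_from(f"<{count}d", buf, HEADER.size))
    return version, samples


if __name__ == "__main__":
    blob = encode([1.0, 2.5, -3.25])
    print(len(blob), decode(blob))
```

A schema compiler such as `protoc` generates this kind of encode/decode pair from a declarative description and also handles schema evolution, which a hand-rolled layout like this one does not.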