rdubz | 4 years ago

Back of envelope:

The 10MB estimated size came from [100 bytes per row] * [100k rows].

50 of the bytes per row were "description", which should compress well (2-3x, I'd guess).

40 bytes per row were the IPFS ID/hash, IIUC. I assumed this is like a Git hash, 40 hex chars, which is really just 20 bytes of entropy.
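A quick sketch of the hex-versus-raw point, assuming the ID really is a SHA-1-style digest as Git uses (a real IPFS ID may be encoded differently). The input string here is a made-up placeholder:

```python
import hashlib

# Hypothetical example value: a SHA-1 digest, the same shape as a Git hash.
# Whether IPFS IDs in the table look like this is an assumption.
hex_id = hashlib.sha1(b"example row").hexdigest()

# Each pair of hex characters encodes one byte, so decoding halves the size.
raw_id = bytes.fromhex(hex_id)

hex_len = len(hex_id)
raw_len = len(raw_id)
print(f"hex text: {hex_len} chars, raw bytes: {raw_len}")
```

Storing the raw bytes in a BLOB column instead of the hex text would capture that halving without any compression step.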

He also estimated 14 bytes for the size, stored as the decimal string representation of an integer (14 digits go up to 1e14 - 1, about 100 TB; a 15th digit would reach 1 PB). That's roughly 47-50 bits, or 6-7 bytes, as a binary integer. Sizes wouldn't be uniformly distributed, though, so they would compress to even fewer bytes.
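To check the bits-and-bytes figure, here is a small sketch that computes how wide a binary integer has to be to hold any value with a given number of decimal digits. The helper name `binary_width` is mine, and it covers both the 14-digit and 15-digit cases since the exact field width is a guess:

```python
import math

def binary_width(decimal_digits: int) -> tuple[int, int]:
    """Return (bits, bytes) needed for the largest value with that many decimal digits."""
    max_value = 10**decimal_digits - 1
    bits = max_value.bit_length()
    return bits, math.ceil(bits / 8)

for digits in (14, 15):
    bits, nbytes = binary_width(digits)
    print(f"{digits} decimal digits -> {bits} bits, {nbytes} bytes")
```

Either way the fixed-width binary form is about half the size of the decimal string, before any entropy coding.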

So if SQLite were smart (or if one gzips the whole db file, like you did), it makes sense that a factor of 2 or so is reclaimable.
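Putting the pieces together as a rough sketch. The per-field numbers are the guesses from above (description compressing 2-3x, hex ID halved, size packed into 7 bytes), not measurements, and the function name is just for illustration:

```python
ROWS = 100_000
ORIGINAL_ROW_BYTES = 50 + 40 + 14  # description + hex ID + size string, ~100 bytes

def compact_row_bytes(description_ratio: float) -> float:
    """Estimated bytes per row after the savings guessed at above."""
    description = 50 / description_ratio  # text compression, ratio is a guess
    ipfs_id = 40 / 2                      # raw bytes instead of hex characters
    size = 7                              # binary integer instead of decimal string
    return description + ipfs_id + size

for ratio in (2.0, 3.0):
    compact = compact_row_bytes(ratio)
    print(
        f"description {ratio:.0f}x: {compact:.1f} B/row, "
        f"{compact * ROWS / 1e6:.1f} MB total, "
        f"{ORIGINAL_ROW_BYTES / compact:.2f}x smaller"
    )
```

This ignores SQLite's per-row and per-page overhead, so treat it as an order-of-magnitude check only.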
