
ZSON, PostgreSQL extension for compressing JSONB

136 points | afiskon | 9 years ago | postgresql.org

13 comments

[+] brianolson|9 years ago|reply
Another way to make JSON smaller is to instead use CBOR, the schema-compatible 'Concise Binary Object Representation' (see IETF RFC 7049 or http://cbor.io/). CBOR encodes and decodes faster too. Or use the 'snappy' compressor.
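(The size win comes from CBOR dropping the quotes, colons, commas and braces of JSON text. A toy encoder for a tiny subset of RFC 7049 - small non-negative ints, short strings, small maps; all names here are illustrative, not a real library - makes the difference concrete:)

```python
import json

def cbor_encode(obj):
    """Toy CBOR (RFC 7049) encoder covering only small non-negative ints,
    short UTF-8 strings, and small maps - just enough to compare sizes."""
    if isinstance(obj, bool):
        raise TypeError("booleans not handled in this sketch")
    if isinstance(obj, int) and 0 <= obj < 24:
        return bytes([0x00 | obj])               # major type 0: unsigned int
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) >= 24:
            raise ValueError("string too long for this sketch")
        return bytes([0x60 | len(data)]) + data  # major type 3: text string
    if isinstance(obj, dict):
        if len(obj) >= 24:
            raise ValueError("map too large for this sketch")
        out = bytearray([0xA0 | len(obj)])       # major type 5: map
        for k, v in obj.items():
            out += cbor_encode(k) + cbor_encode(v)
        return bytes(out)
    raise TypeError(type(obj))

doc = {"name": "zson", "ver": 1}
print(len(json.dumps(doc).encode("utf-8")), "bytes as JSON text")
print(len(cbor_encode(doc)), "bytes as CBOR")
```

(Note the key names themselves are still stored in full - which is exactly what a shared dictionary like ZSON's removes.)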
[+] afiskon|9 years ago|reply
I'm afraid Snappy will not help, at least not a lot. PostgreSQL already has built-in compression (PGLZ). I tested various different algorithms before this shared-dictionary idea - lz4, bzip and others. Some compress a bit better, others a bit faster, but in general the result is almost the same.
[+] jkot|9 years ago|reply
It would be nice to automate dictionary training. Make it part of vacuum.
[+] afiskon|9 years ago|reply
Thank you for an interesting idea! I can't promise I will implement it myself any time soon, however. You know the saying - pull requests are welcome :)
[+] exo762|9 years ago|reply
It would be nice to see a comparison to a general compression algorithm - e.g. deflate.
[+] vog|9 years ago|reply
There is a comparison table at the end of the project's README (https://github.com/afiskon/zson). However, the table columns are not explained very well. It is not 100% clear to me whether "before" means "uncompressed" or "PGLZ compressed".

  Compression ratio could be different depending on documents,
  database schema, number of rows, etc. But in general ZSON
  compression is much better than built-in PostgreSQL
  compression (PGLZ):

     before   |   after    |      ratio       
  ------------+------------+------------------
   3961880576 | 1638834176 | 0.41365057440843
  (1 row)
  
     before   |   after    |       ratio       
  ------------+------------+-------------------
   8058904576 | 4916436992 | 0.610062688500061
  (1 row)
  
     before    |   after    |       ratio       
  -------------+------------+-------------------
   14204420096 | 9832841216 | 0.692238130775149
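Whatever "before" refers to, the "ratio" column at least is clearly just after/before, which the numbers themselves confirm:

```python
# (before, after, ratio) triples copied from the README table above.
rows = [(3961880576, 1638834176, 0.41365057440843),
        (8058904576, 4916436992, 0.610062688500061),
        (14204420096, 9832841216, 0.692238130775149)]

for before, after, ratio in rows:
    # after / before reproduces the printed ratio in every row.
    assert abs(after / before - ratio) < 1e-9
    print(f"{after} / {before} = {after / before:.15g}")
```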
[+] X86BSD|9 years ago|reply
How does this work while running zfs with LZ4?

Has this even been tested on zfs?

[+] afiskon|9 years ago|reply
No, I didn't test it on ZFS. Sorry for asking, but is there any reason to run a DBMS (which, I would like to remind you, has built-in compression) on ZFS, which is itself a small DBMS? Sounds like too much unnecessary overhead to me.