top | item 45248164

(no title)

leobuskin | 5 months ago

What about a specialized dict for FASTA? Shouldn't it increase ZSTD compression significantly?

bede|5 months ago

Yes I'd expect a dict-based approach to do better here. That's probably how it should be done. But --long is compelling for me because using it requires almost no effort, it's still very fast, and yet it can dramatically improve compression ratio.
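To illustrate why a larger match window can help so much on repetitive sequence data, here is a minimal sketch. It uses Python's built-in zlib as a stand-in, not zstd itself: zlib's `wbits` parameter sets the match window size, which is only loosely analogous to what `--long` does for zstd. The synthetic "genome" (a random 4 KiB ACGT block repeated twice) and the helper name `deflate_size` are assumptions made for the illustration.

```python
import random
import zlib


def deflate_size(data: bytes, wbits: int) -> int:
    """Return the compressed size of data using a 2**wbits-byte match window."""
    comp = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=wbits)
    return len(comp.compress(data) + comp.flush())


# Synthetic stand-in for sequence data: one random 4 KiB block, repeated.
# The second copy sits 4096 bytes after the first.
rng = random.Random(0)
block = "".join(rng.choices("ACGT", k=4096)).encode()
data = block * 2

# 512-byte window: the repeat is too far back to be referenced.
small_window = deflate_size(data, wbits=9)
# 32 KiB window: the second copy can be encoded as matches against the first.
large_window = deflate_size(data, wbits=15)

print("512 B window :", small_window, "bytes")
print("32 KiB window:", large_window, "bytes")
```

The same effect at a much larger scale is the appeal of `--long`: repeats that are far apart in a big FASTA file only become visible to the compressor once the window is wide enough to span them.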

tecleandor|5 months ago

From what I've read (although I haven't tested it and can't find the source anymore), dictionaries aren't very useful when the dataset is big, and just using '--long' gets you most of that improvement.

Have any of you tested it?
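
One rough way to sanity-check the claim about dictionaries and dataset size, as a sketch only: the code below uses zlib's preset dictionary (`zdict`) as a stand-in for a trained zstd dictionary, so it shows the general effect rather than zstd's actual behaviour on FASTA. The motif-based data, the sizes, and the helper name `compressed_size` are all assumptions chosen for the illustration.

```python
import random
import zlib


def compressed_size(data: bytes, zdict: bytes = b"") -> int:
    """Return the zlib-compressed size of data, optionally with a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(data) + comp.flush())


# Hypothetical shared vocabulary: 16 random 64-base motifs.
rng = random.Random(0)
motifs = ["".join(rng.choices("ACGT", k=64)).encode() for _ in range(16)]
dictionary = b"".join(motifs)

# Small input: four motifs, each seen only once in the data itself.
small_input = b"".join(motifs[:4])
# Big input: 400 motifs, so every motif recurs many times within the data.
big_input = b"".join(rng.choice(motifs) for _ in range(400))

# Ratio of size with dictionary to size without (lower means the dictionary helped more).
small_ratio = compressed_size(small_input, dictionary) / compressed_size(small_input)
big_ratio = compressed_size(big_input, dictionary) / compressed_size(big_input)

print(f"small input: with-dict / no-dict = {small_ratio:.2f}")
print(f"big input:   with-dict / no-dict = {big_ratio:.2f}")
```

On the small input the dictionary supplies content the compressor has not seen yet, so it helps a lot. On the big input the compressor picks up the same motifs from the data itself early on, so the dictionary's relative contribution shrinks. Whether this carries over to real FASTA files with zstd would still need an actual benchmark.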