I haven’t played around with it much myself, but I remember reading that zlib, the DEFLATE library underlying gzip (and exposed through Python’s zlib module), supports a “seed dictionary” of expected fragments.

I gather that you’d supply the same “seed” during both compression and decompression, and this would reduce the amount of information embedded into the compressed result.
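For what it's worth, here is a minimal sketch of how that might look with Python's standard `zlib` module, which accepts a preset dictionary through the `zdict` argument of `compressobj` and `decompressobj`. The `SEED` contents, the sample message, and the helper names are made up for illustration; a real dictionary would be built from fragments that actually recur in your data.

```python
import zlib

# Hypothetical seed dictionary: fragments we expect to see in the payloads.
SEED = b'{"status": "ok", "user_id": , "message": "hello"}'

def compress_with_seed(data: bytes, seed: bytes = SEED) -> bytes:
    # zdict primes the compressor's window with the seed before any input.
    comp = zlib.compressobj(level=9, zdict=seed)
    return comp.compress(data) + comp.flush()

def decompress_with_seed(blob: bytes, seed: bytes = SEED) -> bytes:
    # The decompressor must be given the very same seed to resolve matches.
    decomp = zlib.decompressobj(zdict=seed)
    return decomp.decompress(blob) + decomp.flush()

message = b'{"status": "ok", "user_id": 42, "message": "hello"}'

plain = zlib.compress(message, 9)      # no dictionary
seeded = compress_with_seed(message)   # same data, primed with SEED
restored = decompress_with_seed(seeded)

print(len(message), len(plain), len(seeded))
```

On short messages like this one the seeded output should come out noticeably smaller, because most of the payload can be encoded as back-references into the seed. Note that this produces a zlib-format stream rather than a gzip file; as far as I know the gzip container itself has no field for a preset dictionary.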

duskwuff|1 year ago

Many other compression libraries, like zstd, support functionality along those lines. For that matter, brotli's big party trick is having a built-in dictionary, tuned for web content.

It's easy to implement in LZ-style compressors - it amounts to injecting the dictionary as context, as if it had been previously output by the decompressor. (There's a striking parallel to how LLM prompting works.)
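To make the "inject the dictionary as context" idea concrete, here is a toy LZ77-style codec. It is a deliberately naive sketch, not how zlib or zstd are actually implemented: the token format, the function names, and the sample strings are all invented for illustration. The only trick is that both sides start with the dictionary already sitting in the history buffer.

```python
def lz_encode(data: bytes, dictionary: bytes = b"", min_match: int = 3):
    # Treat the dictionary as if it were earlier input: matches may point into it.
    buf = dictionary + data
    pos = len(dictionary)
    tokens = []
    while pos < len(buf):
        best_len, best_dist = 0, 0
        # Naive O(n^2) search for the longest earlier match.
        for start in range(pos):
            length = 0
            while pos + length < len(buf) and buf[start + length] == buf[pos + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, pos - start
        if best_len >= min_match:
            tokens.append((best_dist, best_len))  # back-reference
            pos += best_len
        else:
            tokens.append(buf[pos])               # literal byte
            pos += 1
    return tokens

def lz_decode(tokens, dictionary: bytes = b"") -> bytes:
    # The output buffer starts out holding the dictionary,
    # as if the decompressor had already emitted it.
    out = bytearray(dictionary)
    for tok in tokens:
        if isinstance(tok, int):
            out.append(tok)
        else:
            distance, length = tok
            start = len(out) - distance
            for i in range(length):  # byte-by-byte so overlapping copies work
                out.append(out[start + i])
    return bytes(out[len(dictionary):])  # strip the injected context

dictionary = b"HTTP/1.1 200 OK\r\nContent-Type: "
data = b"HTTP/1.1 200 OK\r\nContent-Type: text/html"

plain_tokens = lz_encode(data)
seeded_tokens = lz_encode(data, dictionary)
print(len(plain_tokens), len(seeded_tokens))
```

With the dictionary in place, the shared header collapses into a single back-reference and only the novel tail is sent as literals, which is the whole effect in miniature.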