top | item 4618018

jaylevitt | 13 years ago

Comp sci folks: Is predicting whether a compressed file will produce a finite (or, better, reasonably-sized) output roughly equivalent to the halting problem?

lmkg | 13 years ago

It depends on the decompression algorithm. It's possible for that to be the case, but only if the compressed binary format is essentially a Turing-complete language, for which your decompressor is the interpreter.

I'm not aware of any data formats for which that is the case, but from a theoretical standpoint, eval(s) is a perfectly cromulent decompression algorithm. This fact is essentially the starting point for Kolmogorov complexity.
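To make the `eval(s)` point concrete, here is a minimal sketch (the function name `eval_decompress` is illustrative, not from any real format): if the "compressed file" is an arbitrary Python expression, the format is Turing-complete, and predicting whether decompression even halts is in general undecidable.

```python
def eval_decompress(compressed: str) -> str:
    # The "compressed file" is an arbitrary Python expression whose
    # value is the decompressed data. Deciding whether this call halts
    # for arbitrary input is equivalent to the halting problem.
    return eval(compressed)

# A tiny "compressed file" that expands to a large output:
print(len(eval_decompress("'ab' * 10**6")))  # 2000000
```

Note the asymmetry: a few bytes of input can expand to megabytes of output, which is exactly the starting point for Kolmogorov complexity.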

"Reasonably-sized" is actually an interesting problem in itself. If your decompressor is sufficiently advanced, you could embed a busy-beaver function, which terminates but grows faster than any computable function. I have no idea whether such functions could be expressed in less-than-Turing-complete data formats.

dsl | 13 years ago

RAR has a built-in virtual machine for forward compatibility with new compression algorithms.

cheald | 13 years ago

No, because run-length encoding stores the run lengths explicitly in the compressed data. You can read those and know how big the resulting file will be without having to actually decompress anything.
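A quick sketch of that idea, assuming a hypothetical RLE format that stores (count, byte) pairs: summing the counts gives the output size without producing a single byte of output.

```python
def rle_compress(data: bytes) -> list[tuple[int, int]]:
    # Encode data as a list of (run_length, byte_value) pairs.
    runs: list[tuple[int, int]] = []
    for b in data:
        if runs and runs[-1][1] == b:
            runs[-1] = (runs[-1][0] + 1, b)
        else:
            runs.append((1, b))
    return runs

def predicted_size(runs: list[tuple[int, int]]) -> int:
    # Read only the run lengths -- no decompression needed.
    return sum(count for count, _ in runs)

runs = rle_compress(b"aaaabbbcc")
print(runs)                  # [(4, 97), (3, 98), (2, 99)]
print(predicted_size(runs))  # 9
```

This is why RLE-style formats sit at the opposite extreme from an `eval`-based format: the output size is a trivial computable function of the compressed input.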