Bulat_Ziganshin | 1 year ago

afaik, 7-zip filters can't have multiple inputs (at the encoding stage).

multiple outputs are necessary for filters that output multiple independent data streams such as bcj2. and they are equally useful for archivers and compressors.

(I'm author of freearc, another archiver software, and multiple compression algos)

PS: thank you for the format comparison; it would be great to put the xz format description onto its Wikipedia page. I already used your description to understand why the attackers added 8 "random" bytes to one of their scripts - probably to "fix" the crc-64 value.

lifthrasiir|1 year ago

(Welcome to HN, by the way! While I have also written my own compressor, I'm just a hobbyist compared to the people at encode.su, to be sure :-)

> afaik, 7-zip filters can't have multiple inputs (at the encoding stage).

I don't know whether this is actually used or not, but py7zr's specification clearly mentions that complex coders have `NumInStreams` and `NumOutStreams` parameters specified.

> multiple outputs are necessary for filters that output multiple independent data streams such as bcj2. and they are equally useful for archivers and compressors.

Correct in principle, but those logical streams can still be framed and concatenated into a single physical stream. Any compressor that can handle non-stationary distributions would work well on the result, and it might even be possible to hint the logical stream boundaries to guide the compressor.
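
To make the framing idea concrete, here is a minimal sketch in Python. The layout (a 4-byte stream count, then an 8-byte length before each stream) and the names `frame_streams` and `unframe_streams` are assumptions made up for this illustration; this is not how 7z or xz actually lay out their data.

```python
import struct


def frame_streams(streams):
    """Concatenate logical streams into one physical stream.

    Hypothetical layout for this sketch only: a 4-byte little-endian
    stream count, then for each stream an 8-byte little-endian length
    followed by the stream's bytes.
    """
    out = bytearray(struct.pack("<I", len(streams)))
    for s in streams:
        out += struct.pack("<Q", len(s))
        out += s
    return bytes(out)


def unframe_streams(blob):
    """Split a framed physical stream back into its logical streams."""
    (count,) = struct.unpack_from("<I", blob, 0)
    pos = 4
    streams = []
    for _ in range(count):
        (length,) = struct.unpack_from("<Q", blob, pos)
        pos += 8
        if pos + length > len(blob):
            raise ValueError("truncated stream")
        streams.append(blob[pos:pos + length])
        pos += length
    return streams
```

The framed result can then be fed to any single-stream compressor, and the length prefixes are exactly the kind of boundary information a compressor could in principle be told about.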

Several physical streams are absolutely necessary when those streams are going through different filters, but a lack of them doesn't harm too much either. Also I think xz was planning for "subblock" filters to address this use case anyway.