phao | 3 years ago

Where could one go to read more about the mathematics behind the format, its compression techniques, etc? I remember reading that jpeg 2000 is based on wavelets. Is this the case for jpeg xl?

lifthrasiir|3 years ago

There are multiple strategies working in tandem.

At the very bottom, the entropy coding uses a hybrid of LZ77, Huffman coding and multi-symbol rANS. This is complemented by context modeling, which should be pretty familiar to anyone who knows Brotli.
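To make the rANS part a bit more concrete, here is a minimal sketch of the core state update. It is not the JPEG XL coder: it assumes a fixed frequency table, no context modeling, and an unbounded Python integer as the state instead of a renormalized fixed-width one. The names `build_cumulative`, `rans_encode` and `rans_decode` are made up for this illustration.

```python
from collections import Counter

def build_cumulative(freqs):
    # Assign each symbol a slot range [cum[s], cum[s] + freqs[s]) inside [0, total).
    cum, total = {}, 0
    for sym in sorted(freqs):
        cum[sym] = total
        total += freqs[sym]
    return cum, total

def rans_encode(symbols, freqs):
    cum, total = build_cumulative(freqs)
    x = 1  # initial state; the decoder ends up back here
    # Encode in reverse so that the decoder emits symbols in forward order.
    for sym in reversed(symbols):
        f = freqs[sym]
        x = (x // f) * total + cum[sym] + (x % f)
    return x

def rans_decode(x, count, freqs):
    cum, total = build_cumulative(freqs)
    out = []
    for _ in range(count):
        slot = x % total
        # Find the symbol whose slot range contains `slot`.
        sym = next(s for s in freqs if cum[s] <= slot < cum[s] + freqs[s])
        out.append(sym)
        x = freqs[sym] * (x // total) + slot - cum[sym]
    return out

message = list("abracadabra")
table = Counter(message)
state = rans_encode(message, table)
decoded = rans_decode(state, len(message), table)
```

Frequent symbols grow the state by a smaller factor than rare ones, which is where the compression comes from. A practical coder would keep the state in a machine word and stream bits out as it grows.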

For the lossless (modular) mode the main strategy involves finding a good decision tree to compute a prediction for each pixel; the tree can be learned for each image (thus named "meta-adaptive"). This mode allows for a number of additional transformations, some of which also function as a progressive encoding.
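A toy version of the tree-driven prediction idea might look like the following. Everything here is an assumption for illustration: the tree is hand-written rather than learned, the two properties and three predictors are a tiny made-up subset, and the names (`TREE`, `neighbors`, `predict`, `encode`, `decode`) do not come from libjxl.

```python
# Inner node: (property, threshold, subtree_if_greater, subtree_otherwise).
# Leaf: the name of the predictor to use.
TREE = ("abs(N-NW)", 4,
        ("abs(W-NW)", 4, "gradient", "N"),
        "W")

def neighbors(img, x, y):
    # West, north and north-west neighbors, with simple edge fallbacks.
    w = img[y][x - 1] if x > 0 else (img[y - 1][x] if y > 0 else 0)
    n = img[y - 1][x] if y > 0 else w
    nw = img[y - 1][x - 1] if (x > 0 and y > 0) else w
    return w, n, nw

def predict(tree, w, n, nw):
    props = {"abs(N-NW)": abs(n - nw), "abs(W-NW)": abs(w - nw)}
    node = tree
    while not isinstance(node, str):  # walk down until a leaf is reached
        prop, threshold, if_greater, otherwise = node
        node = if_greater if props[prop] > threshold else otherwise
    if node == "W":
        return w
    if node == "N":
        return n
    # "gradient": W + N - NW, clamped to the range spanned by W and N.
    return max(min(w, n), min(max(w, n), w + n - nw))

def encode(img, tree=TREE):
    # Residual = actual pixel minus prediction from already-coded neighbors.
    return [[img[y][x] - predict(tree, *neighbors(img, x, y))
             for x in range(len(img[0]))] for y in range(len(img))]

def decode(residuals, tree=TREE):
    h, w = len(residuals), len(residuals[0])
    img = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            img[y][x] = residuals[y][x] + predict(tree, *neighbors(img, x, y))
    return img

image = [[10, 10, 50, 52], [11, 10, 51, 53], [12, 30, 52, 54]]
residuals = encode(image)
restored = decode(residuals)
```

The decoder only looks at pixels it has already reconstructed, so it walks the same tree path as the encoder and the round trip is exact. The residuals are what would then go to the entropy coder.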

For the lossy (VarDCT) mode the actual transformation is a (vast) superset of the original JPEG: many more transformations, mostly related to DCT, are available, and there are tons of contexts for each coefficient that can be exploited. Not exactly specific to JPEG XL, but libjxl also features a very good psychovisual model to optimize the resulting visual quality.
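For reference, the DCT building block itself can be sketched in a few lines. This is a plain orthonormal DCT-II and its inverse applied separably to a block of any size; it is slow and illustrative only, and the function names are assumptions, not anything from libjxl. Quantization, which is the actually lossy step, is left out.

```python
import math

def _scale(k, n):
    # Orthonormal scaling: the DC term gets sqrt(1/n), the others sqrt(2/n).
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

def dct(v):
    # 1D DCT-II of a sequence of length n.
    n = len(v)
    return [_scale(k, n) * sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n))
            for k in range(n)]

def idct(c):
    # Inverse of dct() (a DCT-III with the same scaling).
    n = len(c)
    return [sum(_scale(k, n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                for k in range(n))
            for i in range(n)]

def transform2d(block, f):
    # Apply the 1D transform f to every row, then to every column.
    rows = [f(list(r)) for r in block]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# A non-square 4x8 block, to show that nothing here depends on 8x8.
block = [[float(x + 2 * y) for x in range(8)] for y in range(4)]
coeffs = transform2d(block, dct)
restored = transform2d(coeffs, idct)
```

For a smooth block like this ramp, most of the energy ends up in a handful of low-frequency coefficients, which is what makes the coefficients cheap to quantize and code.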

Besides those main modes, there are additional whole-image transformations, color transformations (images can use an absolute color space named XYB when they are allowed to be lossy) and image features. Lastly, images can have multiple frames, some of which are animated and some of which can be merged together when the frame duration is zero, with fully configurable blending modes.