(For JPEG; newer codecs may differ.) The codec has 64 "basis functions", which it uses to encode and decode each 8x8 block of pixels: https://en.wikipedia.org/wiki/Discrete_cosine_transform#/med...

These cosine waves (JPEG's DCT uses only cosines) are orthogonal to each other, which gives you a handy property: you take the dot product of the signal with each basis function to get a list of coefficients, and then you multiply each coefficient by its basis function and add the results up to get the original signal back. Not every set of functions works as a basis like this.
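To make the dot-product idea concrete, here is a minimal 1D sketch in plain Python. It assumes the orthonormal DCT-II scaling, and the names (`basis`, `encode`, `decode`) and the sample values are made up for illustration; a real JPEG codec does the same thing in 2D on 8x8 blocks.

```python
import math

N = 8

def basis(k):
    """k-th orthonormal DCT-II basis vector of length N."""
    scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    return [scale * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def encode(signal):
    # One dot product per basis function gives one coefficient.
    return [dot(signal, basis(k)) for k in range(N)]

def decode(coeffs):
    # Weighted sum of the basis functions rebuilds the signal.
    return [sum(c * basis(k)[n] for k, c in enumerate(coeffs)) for n in range(N)]

signal = [52, 55, 61, 66, 70, 61, 64, 73]  # arbitrary sample values
coeffs = encode(signal)
restored = decode(coeffs)
```

With no quantization in between, `restored` matches `signal` up to floating-point error, which is exactly what orthogonality buys you.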

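You can see that the upper-left one is all white (constant across the block); that's the "DC" (Direct Current) basis, and its coefficient is the block's average brightness, up to scaling. As you go right the horizontal frequency increases, and as you go down the vertical frequency increases.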
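If you want to check that picture yourself, this rough sketch builds the 2D patterns as products of two cosines (scaling omitted) and counts sign changes along the top row as a crude measure of frequency. The helper names are hypothetical.

```python
import math

N = 8

def cos_wave(k, n):
    return math.cos(math.pi * (2 * n + 1) * k / (2 * N))

def basis_2d(u, v):
    """8x8 pattern with horizontal frequency u and vertical frequency v (unscaled)."""
    return [[cos_wave(v, y) * cos_wave(u, x) for x in range(N)] for y in range(N)]

def sign_changes(row):
    # How many times the pattern flips between light and dark along a row.
    return sum(1 for a, b in zip(row, row[1:]) if a * b < 0)

dc = basis_2d(0, 0)  # upper-left pattern: every entry is 1.0
changes_along_top = [sign_changes(basis_2d(u, 0)[0]) for u in range(N)]
print(changes_along_top)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Each step to the right adds one more light/dark flip across the block, and the same holds going down if you count along a column instead.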

So the encoder computes all 64 coefficients and then quantizes them, rounding the high-frequency ones more coarsely to save bits. That's why JPEGs often have ringing / rippling artifacts, where an edge stays fairly sharp but has waves coming off it on either side.
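Here is a small 1D sketch of how that ringing shows up. The step sizes in `STEPS` are invented for illustration (they are not JPEG's real quantization tables), and the input is a hard black-to-white edge.

```python
import math

N = 8

def basis(k):
    scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    return [scale * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N)]

BASIS = [basis(k) for k in range(N)]

# Hypothetical step sizes: coarser rounding for higher frequencies.
STEPS = [8, 16, 32, 64, 128, 256, 512, 1024]

edge = [0, 0, 0, 0, 255, 255, 255, 255]

coeffs = [sum(s * b for s, b in zip(edge, BASIS[k])) for k in range(N)]
quantized = [round(c / q) for c, q in zip(coeffs, STEPS)]
restored = [sum(quantized[k] * STEPS[k] * BASIS[k][n] for k in range(N))
            for n in range(N)]

print(quantized)  # [45, -20, 0, 2, 0, 0, 0, 0]
print([round(v) for v in restored])
```

The two highest nonzero frequencies round away to zero, and the rebuilt signal dips below 0 just left of the edge and overshoots 255 just right of it. Those ripples are the ringing.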

If you quantize the coefficients hard enough, many of the high-frequency ones (toward the bottom right) end up as zero. So JPEG encoders run a lossless compression step on the quantized coefficients (in baseline JPEG, a zigzag scan followed by run-length and Huffman coding) to squish all the zeroes and small values together. You can crunch a JPEG smaller, without changing the decoded pixels, by replacing this lossless step with a newer algorithm.
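A simplified sketch of that lossless step: scan the block in zigzag order so the low frequencies come first, then store each nonzero value with the number of zeroes skipped before it. This is only the general shape; real JPEG codes the DC coefficient separately, caps run lengths, and Huffman-codes the result. The function names and the `"EOB"` marker are illustrative.

```python
def zigzag_order(n=8):
    # Walk the anti-diagonals, alternating direction, as (row, col) pairs.
    order = []
    for s in range(2 * n - 1):
        diagonal = [(y, s - y) for y in range(n) if 0 <= s - y < n]
        order.extend(diagonal if s % 2 else reversed(diagonal))
    return order

def run_length_encode(values):
    """(zeros_skipped, value) pairs; "EOB" means the rest is all zero."""
    pairs, zeros = [], 0
    for v in values:
        if v == 0:
            zeros += 1
        else:
            pairs.append((zeros, v))
            zeros = 0
    pairs.append("EOB")
    return pairs

def run_length_decode(pairs, length=64):
    out = []
    for p in pairs:
        if p == "EOB":
            break
        zeros, value = p
        out.extend([0] * zeros + [value])
    return out + [0] * (length - len(out))

# A quantized block with a few nonzero low-frequency coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 45, -3, 2, 1

scanned = [block[y][x] for y, x in zigzag_order()]
packed = run_length_encode(scanned)
print(packed)  # [(0, 45), (0, -3), (0, 2), (1, 1), 'EOB']
```

Sixty-four numbers collapse to four pairs and a marker, and `run_length_decode(packed)` gives back `scanned` exactly.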

And the decoder just undoes the lossless step, scales the coefficients back up by the quantization step sizes, and multiplies them by the same basis functions to get the bitmap back.
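A rough sketch of that decode step for one 8x8 block, assuming orthonormal DCT scaling and a uniform step size of 16 (both are assumptions for illustration; real JPEG uses a per-frequency table and also level-shifts and clamps the pixels, which is left out here).

```python
import math

N = 8

def c(k, n):
    """Sample n of the k-th orthonormal 1D cosine basis function."""
    scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    return scale * math.cos(math.pi * (2 * n + 1) * k / (2 * N))

def decode_block(quantized, step):
    """Rebuild 8x8 pixels from quantized coefficients indexed [v][u]."""
    pixels = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            coeff = quantized[v][u] * step[v][u]  # dequantize
            if coeff == 0:
                continue
            # Add this coefficient's share of its 2D basis pattern.
            for y in range(N):
                for x in range(N):
                    pixels[y][x] += coeff * c(v, y) * c(u, x)
    return pixels

step = [[16] * N for _ in range(N)]       # hypothetical uniform step size
quantized = [[0] * N for _ in range(N)]
quantized[0][0] = 64                      # DC: overall brightness
quantized[0][1] = -10                     # lowest horizontal frequency

pixels = decode_block(quantized, step)
```

With just these two coefficients you get a block that averages 128 and brightens smoothly from left to right, with every row identical.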

There are details I don't understand, like the loop filters and de-blocking filters that newer codecs use to hide block artifacts (baseline JPEG itself doesn't have them), but the heart of it is just "take a dot product with these functions to encode, multiply those dots with the same functions to decode".
