(no title)
loxias | 1 year ago
Ahhhh. Now we're on the same page. :) Seeing how it helps when combined is somewhat subtle/non-obvious. I believe it relates to BWT and Huffman both being approximations of something more optimal. The two transforms could also have different window sizes -- one rarely does BWT on a whole 1GB file -- which introduce inefficiencies. Huffman coding is also only optimal in the very large alphabet and very long data limits. As your data length and alphabet size decrease, it gets less optimal.
Put differently, "I think that's a wonderfully phrased question, this _is_ my specialization/subfield, and I'm gonna need to chew on it for a while."
crazygringo|1 year ago
Sesse__|1 year ago
BWT is orthogonal to Huffman; like LZ, it exploits that some symbol _sequences_ are more common than others, while Huffman is about the optimality of coding each symbol on its own.