miven | 1 year ago

> For on-device inference, we use low-bit palletization, a critical optimization technique that achieves the necessary memory, power, and performance requirements.

Did they go over the entire text with a thesaurus? I've never seen "palletization" used as a viable synonym for "quantization" before, and I've read quite a few papers on LLM quantization.

bagrow|1 year ago

miven|1 year ago

Huh, generally whenever I've seen the lookup-table approach in the literature it was also referred to as quantization; I guess they wanted to disambiguate the two methods.

Though I'm not sure how warranted that really is. In both cases it's pretty much the same idea of reducing precision, just with different implementations.

Edit: they even refer to it as LUT quantization on another page: https://apple.github.io/coremltools/docs-guides/source/quant...
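To make the distinction concrete: LUT (palette) quantization clusters the weight values into a small codebook and stores only per-weight indices, rather than uniformly rounding each value to a low-bit grid. Here's a minimal illustrative sketch using plain k-means; the function names (`palettize`/`depalettize`) and all parameters are made up for illustration, not Apple's or Core ML Tools' actual API.

```python
# Sketch of LUT ("palette") quantization: cluster float weights into
# 2**nbits centroids (the lookup table), then store only each weight's
# centroid index. Hypothetical names; not a real library API.
import numpy as np

def palettize(weights, nbits=4, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    flat = weights.ravel()
    k = 2 ** nbits
    # Initialize centroids from random weight samples (plain 1-D k-means).
    lut = rng.choice(flat, size=k, replace=False)
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        idx = np.abs(flat[:, None] - lut[None, :]).argmin(axis=1)
        for j in range(k):
            members = flat[idx == j]
            if members.size:
                lut[j] = members.mean()
    idx = np.abs(flat[:, None] - lut[None, :]).argmin(axis=1)
    return lut, idx.astype(np.uint8).reshape(weights.shape)

def depalettize(lut, idx):
    # Reconstruction is just a table lookup, hence "LUT quantization".
    return lut[idx]

w = np.random.default_rng(1).normal(size=(64, 64)).astype(np.float32)
lut, idx = palettize(w, nbits=4)
w_hat = depalettize(lut, idx)
```

With 4 bits you store a 16-entry float table plus one 4-bit index per weight, so the codebook can adapt to the actual weight distribution instead of using a fixed uniform grid.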

fudged71|1 year ago

404

elcritch|1 year ago

Huh, it’s PNG for AI weights.

cgearhart|1 year ago

I also found it confusing the first time I saw it. I believe it is sometimes used because the techniques for DL are very similar (in some cases identical) to algorithms that were developed for color palette quantization (in some places shortened to "palettization"). [1] At this point my understanding is that this term is used to be more specific about the type of quantization being performed.

https://en.wikipedia.org/wiki/Color_quantization

dialup_sounds|1 year ago

I enjoy the plausible irony that they used the very same model they're describing to proofread the article, and it didn't catch palettize (like a color palette) vs. palletize (like a shipping pallet).