angusturner | 8 months ago

There is an excellent talk by Jack Rae called "Compression for AGI", where he shows (what I believe to be) a little-known connection between transformers and compression.

On this view, LLMs are SOTA lossless compression algorithms, where the number of weights doesn't count towards the description length. Sounds crazy but it's true.
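The mechanism behind this claim is that a predictive model plus an arithmetic coder gives a lossless compressor: each symbol costs about -log2 p(symbol | context) bits, so a better predictor means a smaller compressed file, and the decoder recovers the text exactly by running the same model. A minimal sketch of the idea (illustrative names, and computing only the ideal code length rather than implementing a full arithmetic coder; the toy model is even "trained" on the text it compresses, which a real compressor would account for):

```python
import math
from collections import defaultdict

def code_length_bits(text, predict):
    """Ideal lossless code length of `text` under a model, where
    predict(context, symbol) returns p(symbol | context). An arithmetic
    coder achieves this total to within about 2 bits."""
    bits = 0.0
    for i, ch in enumerate(text):
        bits += -math.log2(predict(text[:i], ch))
    return bits

def uniform_predict(context, symbol):
    # Baseline with no learning: 256 equally likely byte values,
    # i.e. 8 bits per symbol.
    return 1 / 256

def make_bigram_predict(corpus):
    # Tiny learned predictor: add-one-smoothed bigram counts.
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(corpus, corpus[1:]):
        counts[a][b] += 1
    def predict(context, symbol):
        prev = context[-1] if context else ""
        total = sum(counts[prev].values())
        return (counts[prev][symbol] + 1) / (total + 256)
    return predict

text = "the cat sat on the mat. the cat sat on the mat."
baseline = code_length_bits(text, uniform_predict)
learned = code_length_bits(text, make_bigram_predict(text))
print(f"uniform model: {baseline:.1f} bits")
print(f"bigram model:  {learned:.1f} bits")
```

The bigram model already compresses the text below 8 bits per character; swap in a transformer's next-token probabilities and the same arithmetic gives the compression rates Rae describes. The "weights don't count" point is that sender and receiver can share the model (or its training recipe) out of band, so only the arithmetic-coded bits travel.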

Workaccount2 | 8 months ago

A transformer that doesn't hallucinate (or knows when it is hallucinating) would be the ultimate compression algorithm. But right now that isn't a solved problem, and it leaves the output of LLMs too untrustworthy to use in place of what are colloquially known as compression algorithms.

Nevermark | 8 months ago

It is still task-dependent.

Compressing a comprehensive command line reference via model might introduce errors and drop some options.

But for many people, especially new users, looking up commands and getting examples via a model would deliver many times the value.

Lossy vs. lossless are fundamentally different, but so are use cases.