
fxj | 5 months ago

Learning == Compression of information.

Learning can be seen as finding a description with a shorter bit length. Think Shannon entropy as the measure of information content. The information is still in the weights, but it is reorganized: the reconstructed sentences (or lists of tokens) will not reproduce the exact same bits, yet the information is still there.
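As a rough illustration of "information content" here, a minimal sketch of Shannon entropy over symbol frequencies (the function name and example string are my own, not from the comment) gives a lower bound on the average bits per symbol any lossless code needs:

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per symbol, from empirical symbol frequencies."""
    counts = Counter(text)
    total = len(text)
    # H = -sum(p * log2(p)) over the distribution of symbols
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

msg = "abracadabra"
print(f"{shannon_entropy(msg):.3f} bits/symbol")
```

A string of one repeated symbol has entropy 0 (fully compressible), while a more varied string needs more bits per symbol; in the comment's framing, a model that "learns" the regularities can store the same information in fewer bits.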


shawntan | 5 months ago

The compression is lossy.