top | item 46305801

(no title)

minraws | 2 months ago

I think you are proposing something that's orthogonal to the OP's point.

They mentioned the training data is much higher for an LLM, LLM's recall not being uniform was never in question.

No one expects compression to be without loss when you scale below knowledge entropy that exists in your training set.

I am not saying LLMs do simple compression but just pointing a mathematical certainity.

(And I think you don't need to be an expert in creating LLMs to understand them, albeit I think a lot of people here have experience with it aswell so I find the additional emphasis on it moot).

discuss

omneity|2 months ago

The way I understood OP’s point is that because LLMs have been trained on the entirety of humanity’s knowledge (exemplified by the internet), then surely they know as much as the entirety of humanity. A cursory use of an LLM shows this is obviously not true, but I am also raising the point that LLMs are only summoning a limited subset of that knowledge at a time when answering any given prompt, bringing them closer to a human polymath than an omniscient entity, and larger LLMs only seem to improve on the “depth” of that polymath knowledge rather than the breadth of it.

Again just my impression from exposure to many LLMs at various states of training (my last sentence was not an appeal to expertise)