ACow_Adonis | 4 months ago
They have no concept of truth or validity, but the frequency of inputs in their training data provides a kind of pseudo-check and natural approximation to truth, as long as frequency and relationships in the training data also have some relationship to truth.
For a lot of textbook coding-type stuff that actually holds: frameworks, shell commands, regexes, common queries and patterns. There's lots of it out there, and generally the more common form carries some measure of validity.
My experience, though, is that they can get thrown off on niche topics, sparse areas, topics humans are likely to be emotionally or politically engaged with (and therefore not approximating truth), or things that are recent and haven't had time to accumulate sufficient frequency. And of course the model also has no concept of whether what it is finding or reporting is true or not.
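The frequency-as-pseudo-check idea can be sketched with a toy bigram model (the corpus and token names here are entirely made up for illustration): where a pattern is common, the most frequent continuation happens to be the "right" one; where the context is sparse, a single noisy example dominates and there is nothing to check it against.

```python
from collections import Counter, defaultdict

# Hypothetical "training data": common git usage repeated, plus one
# bogus command seen exactly once.
corpus = (
    "git commit -m msg . git push origin main . "
    "git commit -m msg . git push origin main . "
    "git push origin main . git frobnicate --wrong flag ."
).split()

# Count how often each token follows each other token.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(prev):
    """Return the most frequent continuation seen after `prev`, or None."""
    if not bigrams[prev]:
        return None  # sparse region: no grounding at all
    return bigrams[prev].most_common(1)[0][0]

# Dense region: frequency tracks common (and here, valid) usage.
print(predict("push"))        # 'origin' — seen three times, looks right
# Sparse region: the one noisy example wins; the error is reproduced
# with exactly the same confidence, since frequency is the only check.
print(predict("frobnicate"))  # '--wrong'
```

Real models generalize far beyond bigram counts, of course, but the failure mode is analogous: in sparse regions the model still emits its highest-probability continuation, with no separate notion of whether that continuation is true.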
This also explains why they have trouble with genuinely new programming, as opposed to reimplementing frameworks or common applications: they lack that frequency-based, probabilistic grounding in truth, and new combinations of libraries and code lead to places of relative sparsity in their weights that leave them unable to function.
The literature/marketing has taken to calling this hallucination, but it's just as easy to think of it as errors produced by probabilistic generation and/or sparsity.