I've seen this idea that "LLMs are just guessing the next token" repeated everywhere. It's true that accuracy on that task is what the training objective optimizes. That is not, however, what the model's output represents in use, in my opinion. I suspect the process is better understood as predicting the next concept, not the next token. As the computation passes from one layer to the next, this concept morphs from a simple token into an ever more abstract representation of an idea. That representation and all the others being created elsewhere from the text interact to form the next, even more abstract concept. In this way ideas "close" to each other get combined and can fuse into one another, until an "intelligent" final output is generated.

It is true that the present configuration doesn't give the LLM a very good way to look back at what its own output has been doing, and I suspect that kind of feedback will be necessary for big improvements in performance. Clearly, there is an integration of information occurring, and it is interesting to contemplate how that plays into Giulio Tononi's definition of consciousness in his integrated information theory.
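To make that layer-by-layer intuition concrete, here is a toy Python/numpy sketch, not anything from a real model: a single attention head with random, untrained weights, and all the names and sizes are made up for illustration. The point is just the mechanism: each layer mixes information across positions, so a position's vector typically drifts away from its raw token embedding toward a contextual blend.

    # Toy sketch of the "next concept" intuition: each self-attention
    # layer lets every position mix in information from every other
    # position, so the vector at a position drifts from "this token"
    # toward "this idea in context". Random weights, single head,
    # nothing trained -- purely illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 16          # embedding width (arbitrary)
    n_tokens = 5    # length of the toy sequence
    n_layers = 4    # depth of the toy stack

    def attention_layer(x, rng):
        # Project to queries/keys/values, then mix positions by
        # softmax-weighted averaging (scaled dot-product attention).
        Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(d)
        weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
        return x + weights @ v   # residual connection: refine, don't replace

    x = rng.normal(size=(n_tokens, d))   # stand-in token embeddings
    start = x.copy()
    for layer in range(n_layers):
        x = attention_layer(x, rng)
        # Cosine similarity to the original embedding typically falls
        # with depth: each position encodes more context, less raw token.
        sim = (x * start).sum(-1) / (np.linalg.norm(x, axis=-1) *
                                     np.linalg.norm(start, axis=-1))
        print(f"layer {layer + 1}: mean similarity to raw token = {sim.mean():.2f}")

In a trained model the mixing weights are learned rather than random, but the structural point is the same: by the top of the stack, what gets predicted is a function of these blended, increasingly abstract vectors, not of the last token alone.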
8crazyideas|1 year ago