At this point, anyone claiming that LLMs are "just" language models isn't arguing in good faith. LLMs are a general-purpose computing paradigm. LLMs are circuit builders: the converged parameters define pathways through the architecture that pick out specific programs. Or, as Karpathy puts it, LLMs are a differentiable computer[1]. Training an LLM discovers programs that reproduce the input sequence well. Tokens can represent anything, not just words. Roughly the same architecture can generate passable images, music, or even video.

[1] https://x.com/karpathy/status/1582807367988654081
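To make the "tokens can represent anything" point concrete, here's a minimal sketch (names and scheme are illustrative, not any particular model's tokenizer): any data that can be discretized into integer IDs from a fixed vocabulary looks identical to the model, whether it started as text, pixels, or audio.

```python
# Illustrative sketch: a token is just an integer ID from a fixed vocabulary.
# The simplest possible scheme: each byte value is its own token (vocab size 256).
def tokenize_bytes(data: bytes) -> list[int]:
    return list(data)

text_tokens = tokenize_bytes("hello".encode("utf-8"))
pixel_tokens = tokenize_bytes(bytes([0, 128, 255]))  # e.g. grayscale pixel values

# Both are plain sequences of integers; the model never sees the modality,
# only the IDs and the statistical structure of the sequence.
```

Real multimodal models use learned discretizers (e.g. vector-quantized codebooks) rather than raw bytes, but the principle is the same: once everything is a token stream, one architecture handles it.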
tovej|20 days ago
But it is extremely silly to say that "large language models are language models" is a bad faith argument.
hackinthebochs|20 days ago