wppick | 4 months ago
When I write prompts, I've stopped thinking of LLMs as just predicting the next word, and instead think of them as a logical model built up by combining the logic of all the text they've seen. I think of the LLM as knowing that cats don't lay eggs, so when I ask it to finish the sentence "cats lay ...", it won't generate the word "eggs", even though "eggs" probably follows "lay" frequently.
godelski | 4 months ago
Next-token prediction is still context-based. It depends not just on the previous token but on the previous (N-1) tokens. The context contains "cats", so even a 3-gram (trigram) model should give you words like "down" instead of "eggs".
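A toy sketch of that conditioning (made-up corpus and counts, purely illustrative):

    from collections import Counter, defaultdict

    # Trigram counts condition the next word on the previous TWO
    # words, so "cats lay" and "hens lay" diverge.
    corpus = (
        "cats lay down on the rug . "
        "cats lay down in the sun . "
        "hens lay eggs every morning ."
    ).split()

    trigrams = defaultdict(Counter)
    for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
        trigrams[(a, b)][c] += 1

    # Most likely continuation given the two-word context.
    print(trigrams[("cats", "lay")].most_common(1))  # [('down', 2)]
    print(trigrams[("hens", "lay")].most_common(1))  # [('eggs', 1)]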
devmor | 4 months ago
What you are seeing is a semi-randomized prediction engine. It does not "know" things; it only shows you an approximation of what a completion of its system prompt and your prompt combined would look like, extrapolated from its training corpus.
What you've mistaken for a "logical model" is simply a large amount of repeated information. To show the difference between this and logic, you need only look at something like the "seahorse emoji" case.
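To make "semi-randomized" concrete: decoders typically sample from a temperature-scaled softmax over the model's scores rather than always taking the top token. A minimal sketch with made-up logits (not from any real model):

    import math, random

    # Hypothetical scores for a few candidate next tokens.
    logits = {"down": 4.0, "around": 2.5, "eggs": 0.5}
    temperature = 0.8

    # Softmax with temperature turns logits into a distribution.
    exps = {t: math.exp(v / temperature) for t, v in logits.items()}
    total = sum(exps.values())
    probs = {t: e / total for t, e in exps.items()}

    # Sampling: usually "down", occasionally "around", rarely "eggs".
    token = random.choices(list(probs), weights=probs.values(), k=1)[0]
    print(probs, token)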
nearbuy | 4 months ago
So what is it repeating?
It's not enough to just point to an instance of LLMs producing weird or dumb output. You need to show how it fits with your theory that they are "just repeating information". This is like pointing to one of the millions of times a person has said something weird, dumb, or nonsensical and claiming it proves humans can't think and can only repeat information.
krackers | 4 months ago
Surely trained neural networks could never develop circuits that implement actual logic via computational graphs...
https://transformer-circuits.pub/2025/attribution-graphs/met...
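The linked page is about attribution graphs traced through real transformers; as a much simpler illustration that network weights can encode an actual logic function, here's the classic hand-weighted XOR net (weights picked by hand for clarity, not taken from the paper):

    import numpy as np

    def relu(x):
        return np.maximum(0, x)

    # Hidden units compute (a + b) and max(0, a + b - 1), i.e.
    # roughly OR-count and AND; the output takes OR - 2*AND = XOR.
    W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
    b1 = np.array([0.0, -1.0])
    W2 = np.array([1.0, -2.0])

    def xor(a, b):
        h = relu(np.array([a, b]) @ W1 + b1)
        return int(h @ W2)

    print([xor(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])
    # [0, 1, 1, 0]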