No, your opinion is wrong: the reason some models don't seem to have a "strong opinion" on anything is not that they predict words based on how similar they are to other sentences in the training data. It is most likely down to how the model was trained with reinforcement learning, and more specifically to recent efforts by OpenAI to reduce hallucination rates by penalizing guessing under uncertainty [1].

[1] https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4a...
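As a toy sketch of that incentive (my own illustration with a hypothetical penalty value, not the paper's actual scoring scheme): if a wrong answer is penalized while "I don't know" scores zero, a reward-maximizing model only states an answer when its confidence clears a threshold.

    # Toy numbers, not the paper's exact scheme: +1 for a correct answer,
    # 0 for abstaining ("I don't know"), -penalty for a wrong guess.
    def expected_score(confidence: float, penalty: float) -> float:
        """Expected score of answering, given the model's confidence it is right."""
        return confidence * 1.0 - (1.0 - confidence) * penalty

    def should_answer(confidence: float, penalty: float) -> bool:
        """Answering only beats abstaining (score 0) above a confidence threshold."""
        return expected_score(confidence, penalty) > 0.0

    for confidence in (0.3, 0.5, 0.8):
        print(confidence, should_answer(confidence, penalty=1.0))
    # With penalty=1.0 the break-even point is 0.5: trained against this kind of
    # grading, the model learns to hedge on anything it is less than ~50% sure of.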
andy12_|13 days ago
- An LLM that works through completely different mechanisms, like predicting masked words, predicting the previous word, or predicting several words at a time.
- A normal, traditional program, like a calculator, encoded as an autoregressive transformer that calculates its output one word at a time ("compiled" neural networks) [1][2]
So saying "it predicts the next word" is a nothing-burger. That a program calculates its output one token at a time tells you nothing about its behavior.
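To make that concrete, here is a toy sketch (mine, not from the cited papers): a deterministic calculator wrapped in an autoregressive interface, emitting its answer one token at a time, with no "similarity to training sentences" anywhere in sight.

    # A deterministic "calculator" with an autoregressive interface: each call
    # produces the next output token, conditioned on the tokens emitted so far.
    # The token-at-a-time interface says nothing about the mechanism inside,
    # which here is ordinary integer arithmetic.
    def next_token(prompt: str, generated: str) -> str | None:
        """Return the next output token for an 'a+b=' prompt, or None when done."""
        a, b = prompt.rstrip("=").split("+")
        answer = str(int(a) + int(b))
        return answer[len(generated)] if len(generated) < len(answer) else None

    def generate(prompt: str) -> str:
        out = ""
        while (tok := next_token(prompt, out)) is not None:
            out += tok  # one token appended per step, like an LLM decoder
        return out

    print(generate("123+456="))  # -> "579", produced digit by digit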
[1] https://arxiv.org/pdf/2106.06981
[2] https://wengsyx.github.io/NC/static/paper_iclr.pdf