throwaway482945 | 1 year ago
To be clear, I am not saying there are no limits to what LLMs can do, I just don't get how people can be so sure one way or the other. Especially when you consider that this technology is evolving at such an unpredictable pace.
appplication|1 year ago
Simplifying a bit: attention provides a way for the model to build context for one word based on how often it is seen with others. It doesn’t have a concept of correct or incorrect. It doesn’t have a concept of reasoning.
What is impressive is that even without these concepts of correctness and reasoning, the model can still perform quite well on tasks where correctness and reasoning would be expected. But this says more about the corpus of knowledge and the power of language in general than about the model’s own capabilities. It’s important not to confuse the ability to seem correct and seem well reasoned with any actual mechanism for being so.
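To make "builds context on one word based on the others" concrete, here is a minimal sketch of scaled dot-product self-attention in NumPy. The shapes, names, and toy data are illustrative assumptions, not any particular model's implementation:

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def attention(Q, K, V):
        # Each token's query is compared against every token's key;
        # the resulting weights say how much each other token's value
        # contributes to this token's new representation.
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)     # (seq, seq) pairwise affinities
        weights = softmax(scores, axis=-1)  # each row sums to 1
        return weights @ V                  # weighted mix of value vectors

    # Toy example: 3 tokens, 4-dimensional embeddings, self-attention
    rng = np.random.default_rng(0)
    x = rng.normal(size=(3, 4))
    out = attention(x, x, x)
    print(out.shape)                        # (3, 4)

Note there is nothing in there that checks whether the resulting mixture is "correct" in any sense; it is just a learned weighted average over the other tokens.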
ofrzeta|1 year ago
See the comment on the "Golden Gate Bridge" version of Claude:
"The fact that we can find and alter these features within Claude makes us more confident that we’re beginning to understand how large language models really work." (emphasis mine)
https://www.anthropic.com/news/golden-gate-claude
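Roughly, what that post describes is feature steering: locating a direction in the model's activations that corresponds to a concept (here, the Golden Gate Bridge) and amplifying it during inference. A hand-wavy sketch of that idea, with all names, dimensions, and the steering rule being illustrative assumptions rather than Anthropic's actual method:

    import numpy as np

    def steer_activations(activations, feature_direction, strength=10.0):
        # Add a scaled copy of the concept's direction to every token's
        # activation vector, nudging the model toward that concept.
        unit = feature_direction / np.linalg.norm(feature_direction)
        return activations + strength * unit

    # Toy example: 5 tokens, 16-dimensional activations, a made-up
    # "Golden Gate Bridge" feature direction.
    rng = np.random.default_rng(1)
    acts = rng.normal(size=(5, 16))
    bridge_feature = rng.normal(size=16)
    steered = steer_activations(acts, bridge_feature)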
opprobium|1 year ago
mewpmewp2|1 year ago