Absolutely. They hooked up an LLM and asked it to talk as if it's thinking. But LLMs like GPT are token predictors, pure language models. They have no mental model, no intentionality, and no agency. They don't think.
This is pure anthropomorphization. But so it always is with pop sci articles about AI.
It's quite an odd setup. If we presuppose the "agent" is smart enough to knowingly cheat, would it not then also be smart enough to knowingly lie?
All I really get out of this experiment is that there are weights in there that encode the fact that it's doing an invalid move. The rules of chess are in there. With that knowledge it's not surprising that the most likely text generated when doing an invalid move is an explanation for the invalid move. It would be more surprising if it completely ignored it.
It's not really cheating, it's weighing the possibility of there being an invalid move at this position, conditioned by the prompt, higher than there being a valid move. There's no planning, it's all statistics.
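To make the "it's all statistics" point concrete, here is a minimal sketch (stdlib only; the move strings and scores are invented for illustration, not from any real model): if the model's conditional scores happen to rank an illegal move highest, greedy decoding emits it. No plan to cheat is needed, just an unfavorable distribution.

```python
import math

def softmax(logits):
    # Convert raw scores into a probability distribution over moves.
    m = max(logits.values())
    exps = {move: math.exp(score - m) for move, score in logits.items()}
    total = sum(exps.values())
    return {move: e / total for move, e in exps.items()}

# Hypothetical next-move scores a model might assign, conditioned on a
# prompt that frames the position as desperate. "Qd1xf8" is assumed to
# be illegal in this position.
logits = {"Nf3": 1.2, "e4": 0.8, "Qd1xf8": 2.5}

probs = softmax(logits)
sampled = max(probs, key=probs.get)  # greedy decoding
print(sampled)  # prints "Qd1xf8": the illegal move wins on probability alone
```

The decoder never consults the rules of chess; legality only enters the picture insofar as the training data shaped those scores.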
You could create a non-intelligent chess-playing program that cheats. It's not about the scratchpad. It's trying to answer the question of whether a language model, given an opportunity, could circumvent the rules over failing the task.
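For what it's worth, a "non-intelligent program that cheats" is trivial to write. A toy sketch (the evaluation and move strings are made up): a hard-coded branch that, whenever its material count says it is losing, plays a nonsense move. No understanding, no intent, yet behaviorally it "circumvents the rules over failing the task".

```python
def toy_eval(material_balance):
    # Positive: we are ahead; negative: we are behind.
    return material_balance

def pick_move(material_balance, legal_moves):
    if toy_eval(material_balance) < 0:
        # Hard-coded "cheat": a move to a square that doesn't exist.
        return "Ke1-k9"
    return legal_moves[0]

print(pick_move(-3, ["e2e4", "d2d4"]))  # prints "Ke1-k9" (the cheat)
print(pick_move(+1, ["e2e4", "d2d4"]))  # prints "e2e4" (a legal move)
```

The interesting claim in the experiment isn't that cheating behavior can exist, but whether it emerges unprompted from a general-purpose model.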
> could circumvent the rules over failing the task.
or the whole thing is just a reflection of the rules being incorrectly specified. As others have noted, minor variations in how rules are described can lead to wildly different possible outcomes. We might want to label an LLM's behavior as "circumventing", but that may be because our understanding of what the rules allow and disallow is incorrect (at least compared to the LLM's "understanding").
I suspect that this commonplace notion about the depth of our own mental models is being overly generous to ourselves. AI has a long way to go with working memory, but not as far as portrayed here.
philipov|1 year ago
The chorus line of every human ever attempting to rationalize cheating.