top | item 47137272

HarHarVeryFunny | 5 days ago

Sure, if an open-ended response were allowed, but if it were a multiple-choice question then you'd have to use your common sense and pick one.

However, the important issue here really isn't the ability of humans or LLMs to recognize logic puzzles. If you were asking an LLM for real-world advice, trying to be as straightforward as possible, you might still get a response just as bad as "walk" without being able to recognize that it was bad, and the reason for the failure would be exactly the same as here: a failure to plan and reason through consequences.

It's toy problems like this that should make you step back once in a while and remind yourself of how LLMs are built and how they are therefore going to fail.
