The research paper at https://arxiv.org/abs/2406.02061v1 demonstrates a reasoning breakdown in SOTA LLMs using a simple question: “Alice has N brothers and she also has M sisters. How many sisters does Alice’s brother have?” I investigated the performance of different prompts on this question and show that an 'Expand-then-solve' prompt significantly outperforms standard and chain-of-thought prompts.
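
For reference, the expected answer to the question is M + 1, because Alice herself counts as one of her brother's sisters. The snippet below is only a minimal sketch of that ground truth, which could be used to score model responses. The function name is hypothetical, and it assumes that all siblings share the same parents and that Alice is female.

```python
def sisters_of_alices_brother(n_brothers: int, m_sisters: int) -> int:
    """Return the number of sisters that any brother of Alice has.

    Assumption (not stated in the post): all siblings share the same
    parents, so each brother's sisters are Alice's M sisters plus Alice.
    """
    if n_brothers < 1:
        # The question only makes sense if Alice has at least one brother.
        raise ValueError("Alice must have at least one brother")
    if m_sisters < 0:
        raise ValueError("The number of sisters cannot be negative")
    # The number of brothers does not affect the answer.
    return m_sisters + 1


# Example: Alice has 3 brothers and 6 sisters.
print(sisters_of_alices_brother(3, 6))  # prints 7
```

Note that N acts purely as a distractor: the result depends only on M.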