I wrote this because I was fascinated by the gap between an LLM's ability to talk about chess and its inability to reliably make legal moves.
Even with huge context windows and massive training sets, LLMs consistently fail at basic chess logic once they are taken "out of book." In the post, I break down why this isn't just a matter of more data, but a fundamental mismatch between the transformer architecture (probabilistic next-token prediction) and the symbolic, state-based requirements of a 64-square grid.
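
To make the "state-based" point concrete, here's a minimal sketch using the python-chess library (my tooling choice for illustration, not something from the post): whether a move is legal depends entirely on the current board state, not on how often the move text appears in game records.

```python
# Legality is a property of board state, not of token frequency.
# Uses the python-chess library (pip install chess).
import chess

board = chess.Board()
board.push_san("e4")   # 1. e4
board.push_san("e5")   # 1... e5

# "Nf3" is legal for White here. "Nf6" appears constantly in game
# text (Black plays it all the time), so a pure next-token predictor
# may happily emit it, but it's illegal for White in this position.
# An explicit state tracker rejects it immediately.
for san in ["Nf3", "Nf6"]:
    try:
        board.parse_san(san)
        print(f"{san}: legal in this position")
    except ValueError:
        print(f"{san}: textually plausible, but illegal given the state")
```

The predictor only ever sees a string of moves; the validator has to maintain the full 64-square state to answer the legality question at all, which is the mismatch the post digs into.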
nicowesterdale|5 days ago