top | item 43872598

invalidroot | 10 months ago

Nice writeup! This is the second post I've seen in the genre of "I've had a secret, personal benchmark for LLMs where the 'solution' requires questioning the premises, and o4-mini-high beats it." The first post I saw was about a chessboard and the prompt "mate in one:" https://x.com/KelseyTuoc/status/1912945346126417940

(Edited to remove direct spoiler for the MU-puzzle, in case people want to try it.)
