It's slightly less (or more?) impressive when you see that it thought for 45 seconds (!) and basically hand-executed the code to see what it did. I'd love to know how many tokens that actually took. It's worth remembering that LLMs can be viewed as term-rewriting engines, and as such can compute anything computable given enough space.
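To make the term-rewriting point concrete, here's a rough sketch of a tiny rewrite system doing Peano addition by rewriting a term step by step until no rule applies. It says nothing about how an LLM works internally; it only illustrates "computing by rewriting", where every intermediate term has to be written out in full. All the names (`num`, `step`, `normalize`, `to_int`) are made up for this example.

```python
# Terms are nested tuples (a hypothetical encoding for this sketch):
#   ("Z",)         zero
#   ("S", t)       successor of t
#   ("add", a, b)  a + b, not yet evaluated
Z = ("Z",)


def S(t):
    return ("S", t)


def num(n):
    """Build the Peano term for the non-negative integer n."""
    t = Z
    for _ in range(n):
        t = S(t)
    return t


def step(t):
    """Apply one rewrite rule, outermost first; return None if t is in normal form.

    Rules:
      add(Z, y)    -> y
      add(S(x), y) -> S(add(x, y))
    """
    if t[0] == "add":
        a, b = t[1], t[2]
        if a == Z:
            return b
        if a[0] == "S":
            return S(("add", a[1], b))
    # No rule matched at the root, so try to rewrite inside a subterm.
    for i, child in enumerate(t[1:], start=1):
        new_child = step(child)
        if new_child is not None:
            return t[:i] + (new_child,) + t[i + 1:]
    return None


def normalize(t):
    """Rewrite until no rule applies; return (normal_form, number_of_steps)."""
    steps = 0
    while True:
        nxt = step(t)
        if nxt is None:
            return t, steps
        t, steps = nxt, steps + 1


def to_int(t):
    """Count the S wrappers of a fully evaluated numeral."""
    n = 0
    while t[0] == "S":
        n += 1
        t = t[1]
    return n


if __name__ == "__main__":
    result, steps = normalize(("add", num(2), num(3)))
    print(to_int(result), "after", steps, "rewrite steps")
```

The step count grows with the size of the first argument, which is the "slow but it gets there" flavour of hand-executing code one rewrite at a time.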
dTal|1 year ago