

ilmenit | 6 months ago

While I was stress-testing LLM agents on the RetroArch codebase (quite complex, with a lot of conditional compilation), Gemini 2.5 Pro admitted that the task was beyond its capabilities. That's actually nice behavior, and I'd expect to see it more often from other models, which don't stop and instead keep making a bigger and bigger mess.
