(no title)
foddiangames | 10 months ago
GPT-4o, Claude Sonnet 3.7, Gemini 2.5 Pro, and Gemini 2.0 Flash playing Doom II (default difficulty) on VideoGameBench-Lite with the same input prompt! Models achieve varying levels of success but none are able to pass even the first level.
No comments yet.