(no title)
sottol | 8 months ago
Not sure if OpenAI has updated o3, but it looks like "pure" o3 (high) has a score of 79.6% in the linked table, and the "o3 (high) + gpt-4.1" combo has the highest score at 82.7%.
The previous Gemini 2.5 Pro Preview 05-06 (yea, not current 06-05!) was at 76.9%.
That looks like a pretty nice bump!
But either way, these Aider benchmarks seem to be the most useful/trustworthy benchmarks currently, and really the only ones I'm paying attention to.