haffi112 | 6 months ago

It makes it look like the presentation is rushed or made last minute. Really bad to see this as the first plot in the whole presentation. Also, I would have loved to see comparisons with Opus 4.1.

Edit: Opus 4.1 scores 74.5% (https://www.anthropic.com/news/claude-opus-4-1). This makes it look as though Anthropic released the upgrade to remain the leader on this important benchmark.

danpalmer | 6 months ago

> like the presentation is rushed or made last minute

Or written by GPT-5?

herval | 6 months ago

They never compare with other vendors.