top | item 46812297

eterm | 1 month ago

4. The graph starts January 8.

Why January 8? Was that an outlier high point?

IIRC, Opus 4.5 was released in late November.

F7F7F7|1 month ago

Right after the holiday double-token promotion, users perceived a huge regression in capabilities. I bet that triggered the idea.

pertymcpert|1 month ago

People were away for the holidays. What do you want them to do?

littlestymaar|1 month ago

Or maybe, just maybe, that's when they started testing…

eterm|1 month ago

The Wayback Machine has nothing for this site before today, and the article says "last updated Jan 29".

A benchmark like this ought to start fresh from when it is published.

I don't entirely doubt the degradation, but the choice of where they went back to feels a bit cherry-picked to demonstrate the value of the benchmark.