grantpitt | 3 months ago

do say more

GodelNumbering|3 months ago

Makes it sound like a one-trick pony

jascha_eng|3 months ago

Anthropic is leaning heavily into agentic coding, so it makes sense for them to use SWE-bench Verified as their main benchmark. It is also the one benchmark where Google did not take the top spot last week. Claude remains king; that's all that matters here.

grantpitt|3 months ago

well, it's a big trick