item 46037877 (no title)

grantpitt | 3 months ago
do say more

GodelNumbering | 3 months ago
Makes it sound like a one-trick pony.

jascha_eng | 3 months ago
Anthropic is leaning heavily into agentic coding, so it makes sense for them to use SWE-bench Verified as their main benchmark. It is also the one benchmark where Google did not take the top spot last week. Claude remains king; that's all that matters here.

grantpitt | 3 months ago
Well, it's a big trick.