Why SWE-bench Verified no longer measures frontier coding capabilities (openai.com)
2 points | gmays | 3 days ago
agentica_ai|3 days ago
[deleted]