#1 open-source agent on SWE-Bench Verified by combining Claude 3.7 and O1

9 points | emmabotbot | 11 months ago | augmentcode.com

3 comments

colinflaherty|11 months ago

Colin here, author of the post - would love to answer questions about this.

And make sure to try out the open-source repo! It's a super easy starting point for experimenting with coding agents. It's nearly one-click to run agents in isolated Docker containers on SWE-bench Verified problems, ensemble the results, and run the SWE-bench evaluation harness to compute scores.

Check it out here: https://github.com/augmentcode/augment-swebench-agent

arunchaganty|11 months ago

Nice! I know it's super hard to reach SOTA on a benchmark, especially a few years into it, so congrats on the milestone!

nuatsimon|11 months ago

huge props to the team for open sourcing this