Unfortunately not, as we used our own internal code for the benchmark. We would also like to see more benchmarks that reflect day-to-day agentic coding use.
Roughly, we had Cursor software engineers record real questions they were asking models, along with the PRs they ultimately made containing the results. We then cleaned these up. That is the benchmark.
gabriel666smith|4 months ago
It's the most prominent part of the release post - but it's really hard to understand what exactly it's saying.
srush|4 months ago