data_maan | 22 days ago

On the website https://1stproof.org/#about they claim: "This project represents our preliminary efforts to develop an objective and realistic methodology for assessing the capabilities of AI systems to autonomously solve research-level math questions."

Sounds to me like a benchmark in all but name. And they failed pretty terribly at achieving what they set out to do.

bwfan123 | 22 days ago

> And they failed pretty terribly at achieving what they set out to do.

Why the angst? If the AI can autonomously solve these problems, isn't that a huge step forward for the field?

data_maan | 22 days ago

It's not angst. It's intense frustration that 1) they are not doing the science correctly, and 2) others (e.g. FrontierMath) already did everything they claim to be doing, so we won't learn anything new here, yet somehow 1stproof gets all the credit.