top | item 46561244

(no title)

thomasahle | 1 month ago

It took Andrew Wiles 7 years of intense work to solve Fermat's Last Theorem.

The METR institute predicts that the length of tasks AI agents can complete doubles every 7 months.

We should expect it to take until 2033 before AI solves Clay Institute-level problems with 50% reliability.

discuss

kelseyfrog|1 month ago

That's exactly why the Millennium Prize Problem Bench[1] was created.

1. https://mppbench.com/

thomasahle|1 month ago

That's amazing :D

zozbot234|1 month ago

There is an ongoing effort to formalize a modern, streamlined proof of FLT in Lean, with all the needed prereqs. It's estimated that it will take approx. 5 years, but perhaps AI will lead to some meaningful speedup.

pfdietz|1 month ago

What I'm hoping to see is high volume automated formalization of the math literature, with the goal of formalizing (or finding flaws in) the entire thing.

And once we have that formalized corpus, it's all set up as training data for moving forward.

Davidzheng|1 month ago

If you have a sufficiently strong verifier 1/100000 reliability is already enough

thomasahle|1 month ago

Sure, but then 50% reliability just becomes a matter of whether you can make a strong enough verifier.