top | item 46500575

(no title)

jupitr | 1 month ago

This is quite seriously missing the point: computational theorem-provers and contributions to mathlib (for Lean, anyway) allow for checking "correctness". The LLM is one of the many ways to search for tactics that help complete a proof for a theorem not yet in mathlib.

The LLM is not done until the theorem-prover says it is done - at which point we can be sure of the correctness of the assertion in the context. AFAIK there is no debate about inaccuracy with theorem-provers. Their real limitation continues to be converging in realistic timescales and efficient search over all combinations of proof tactics.

discuss

watchthedemo|1 month ago

> The LLM is not done until the theorem-prover says it is done - at which point we can be sure of the correctness of the assertion in the context. AFAIK there is no debate about inaccuracy with theorem-provers.

Your "in the context" is doing a lot of heavy lifting here, see:

- The Math Is Haunted : https://news.ycombinator.com/item?id=44739315 : https://overreacted.io/the-math-is-haunted/

- Three ways formally verified code can go wrong in practice : https://news.ycombinator.com/item?id=45555727 : https://buttondown.com/hillelwayne/archive/three-ways-formal...

- Breaking “provably correct” Leftpad : https://news.ycombinator.com/item?id=45492274 : https://lukeplant.me.uk/blog/posts/breaking-provably-correct...

- what does a lean proof prove? : https://news.ycombinator.com/item?id=46286605 : https://supaiku.com/what-does-a-lean-proof-prove

> Their real limitation continues to be converging in realistic timescales and efficient search over all combinations of proof tactics.

I agree "efficient search over all combination of proof tactics" will be a defining moment in the field. I disagree these LLMs are it.