(no title)
robrenaud|2 months ago
Low-level, automated theorem proving is going to fall way quicker than most expected, like AlphaGo, precisely because an MCTS++ search over Lean proofs is scalable, amenable to self-play, and relevant to a significant chunk of professional math.
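For intuition, here's a toy example of what the search space looks like (a standard warm-up lemma, shown purely for illustration): each tactic is a move, and the Lean kernel is a perfect verifier, so a searcher can try small steps and keep whatever checks:

    -- every line below is independently machine-checked; a searcher
    -- can assemble this move by move, discarding failed branches
    example (n : Nat) : 0 + n = n := by
      induction n with
      | zero => rfl
      | succ k ih => rw [Nat.add_succ, ih]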
Legit, I almost wish the US and China would sign a Formal Mathematics Proliferation Treaty, as a sign of good will between very powerful parties who have much to gain from each other. When your theorem prover is sufficiently better than most Fields medalists alive, you share your arch/algorithms/process with the world. So mathematics stays in the shared realm of human culture, and it doesn't just happen to belong to DeepMind, OpenAI, or DeepSeek.
baq|2 months ago
The workflow I'm envisioning here: the plan document we're all writing nowadays isn't translated directly into code, but first into a TLA+/Alloy/... model that serves as executable docs, and only then lowered into code space, with conformance continuously monitored (that monitoring is the toil that makes this not worth it most of the time without LLMs). AI literature search for similar problems and solutions is also obviously helpful during all phases of the sweng process.
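To make that concrete, a minimal sketch of what such an executable doc could look like in TLA+ (the module, variable, and invariant are all made up for illustration):

    ---- MODULE PlanModel ----
    EXTENDS Naturals
    VARIABLE pending  \* work items the plan says must drain to zero
    Init == pending = 10
    Next == IF pending > 0 THEN pending' = pending - 1 ELSE UNCHANGED pending
    Spec == Init /\ [][Next]_pending
    \* the property the plan promises; TLC can check it before any code exists
    NoOvershoot == pending >= 0
    ====

The point being that the invariant gets model-checked while the plan is still a plan, and the code is later held to the same property.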
robot-wrangler|2 months ago
I'm sure we've agreed on this before, but I agree again ;) There are dozens of us at least, dozens! There's also a recent uptick in posts with related ideas, for example this hit the front-page briefly ( https://news.ycombinator.com/item?id=46251667 ).
I was tempted to start with Alloy/TLA+ for my own experiments along these lines because of their popularity, but since the available training data is so minimal for everything in this space, I went with something more obscure (MCMAS) just for access to "agents" as primitives in the model checker.
gaigalas|2 months ago
I'm curious whether there is a scenario in which a large automated proof is achieved but there is no practical way to understand what it means.
I'm an engineer. Think of it like this: a large, complex program that compiles, but you don't understand what it does or how to use it. Is such a thing possible?
JulianWasTaken|2 months ago
That's true, though, of Lean code written by a human mathematician.
AI systems are capable of (and generally even predisposed to) producing long, roundabout proofs that are a slog to decipher. So yes, the feeling at times is much like an LLM handing you a large, sometimes redundant-in-parts program.
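A contrived Lean sketch of the flavor (both proofs are valid; the lemma is made up). The first is the kind of detour a search process emits; the second is what a human would write:

    example (a b : Nat) (h : a = b) : a + 1 = b + 1 := by
      have hsymm : b = a := h.symm
      have hnoop : a + 1 = a + 1 := rfl  -- redundant step, never used
      rw [hsymm]

    example (a b : Nat) (h : a = b) : a + 1 = b + 1 := by
      rw [h]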
gus_massa|2 months ago
Sometimes the original proof is completely replaced, bit by bit, until there is an easy-to-understand version.
QuesnayJr|2 months ago
https://en.wikipedia.org/wiki/Four_color_theorem
https://en.wikipedia.org/wiki/Kepler_conjecture
tug2024|2 months ago
[deleted]
voxl|2 months ago
But you neglect something important: it's the programmer who establishes the rules of the game, and as Grothendieck already taught us, often just setting up the game is ALL of the work, and the proof is trivial.
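A toy Lean illustration of that point (the definition is hypothetical): if the setup is chosen so that the property computes, the remaining proof collapses to rfl and the kernel does the rest:

    -- all of the content is in choosing this definition
    def isEven (n : Nat) : Prop := n % 2 = 0

    -- ...after which the "theorem" is definitionally true
    example : isEven 1000 := rfl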
robrenaud|2 months ago
Because AlphaGo can only do one.
AI could very well be better at formal theorem proving than Fields medalists pretty soon. It will not have taste, the ability to see the beauty in math, or the judgment to pick problems and set directions for the field. But given a well-specified problem, it can brute-force search through Lean tactic space at an extremely superhuman pace. What it lacks in intuition and brilliance, it will make up for in being able to explore in parallel.
There is a quality/quantity tradeoff in search with a verifier. A superhuman artificial theorem prover can generate much worse ideas on average than a top mathematician and make up for it by trying many more of them.
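To put rough (purely illustrative) numbers on it: if each sampled attempt succeeds independently with probability p, then n verified attempts succeed with probability 1 - (1 - p)^n. A machine with p = 0.0001 and a million attempts gets 1 - 0.9999^1000000 ≈ 1; a human with p = 0.3 and three attempts gets about 0.66.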
It's Kasparov vs. Deep Blue and Sedol vs. AlphaGo all over again.
It's also nowhere near AGI. Embodiment and the real world are super messy. See Moravec's paradox.
Practical programs deal with the outside world; they are underspecified; their utility depends on the changing whims of people. The formal specification of a math problem is self-contained.