(no title)
upghost | 9 days ago
I realize formal verification with lean is a slightly different game but if anyone here has any insight, I tend to be extremely nervous about a confidently presented AI "proof" because I am sure that the proof is proving whatever it is proving, but it's still very hard for me to be confident that it is proving what I need it to prove.
Before the dog piling starts, I'm talking specifically about distributed systems scenarios where it is just not possible for a human to think through all the combinatorics of the liveness and safety properties without proof assistance.
I'm open to being wrong on this, but I think the skill of writing a proof and understanding the proof is different than being sure it actually proves for all the guarantees you have in mind.
I feel like closing this gap is make it or break it for using AI augmented proof assistance.
oggy|9 days ago
But it's still usually possible to distill a few crucial properties that can be specified in an "obviously correct" manner. It takes A LOT of work (sometimes I'd be stuck for a couple of weeks trying to formalize a property). But in my experience the trade off can be worth it. One obvious benefit is that bugs can be pricey, depending on the system. But another benefit is that, even without formal verification, having a few clear properties can make it much easier to write a correct system, but crucially also make it easier to maintain the system as time goes by.
awesomeMilou|8 days ago
daxfohl|9 days ago
There was a post a few months ago demonstrating this for various "proved" implementations of leftpad: https://news.ycombinator.com/item?id=45492274
This isn't to say it's useless; sometimes it helps you think about the problem more concretely and document it using known standards. But I'm not super bullish on "proofs" being the thing that keeps AI in line. First, like I said, they're easy to specify incorrectly, and second, they become incredibly hard to prove beyond a certain level of complexity. But I'll be interested to watch the space evolve.
(Note I'm bullish on AI+Lean for math. It's just the "provably safe AI" or "provably correct PRs" that I'm more skeptical of).
fauigerzigerk|9 days ago
But do we have anything that works better than some form of formal specification?
We have to tell the AI what to do and we have to check whether it has done that. The only way to achieve that is for a person who knows the full context of the business problem and feels a social/legal/moral obligation not to cheat to write a formal spec.
nextos|9 days ago
It's less powerful, but easier to break down and align with code. Dafny and F* are two good showcases. Less power makes it also faster to verify and iterate on.
deterministic|7 days ago
Using LEAN or Coq requires you to basically convert your code to LEAN/Coq before you can start proving anything. And importing some complicated Hoare logic library. While proving things correct in Dafny (for example) feels much more like programming.
johnbender|9 days ago
In the case of digital systems it can be much worse because we often have to include many assumptions to accommodate the complexity of our models. To use an example from your context, usually one is required to assume some kind of fairness to get anything to go through with systems operating concurrently but many kinds of fairness are not realistic (eg strong fairness).
esafak|9 days ago
youknownothing|9 days ago