
My first verified imperative program

181 points | TwoFx | 7 months ago | markushimmel.de

93 comments


Joker_vD|7 months ago

Naturally, this proof only works for arbitrary-precision integers: when you use fixed-precision integers, the algorithm will wrongfully report "false" for arrays like e.g. [INT_MIN, -1] or (if you insist on C semantics) [UINT_MAX, 1].

Hopefully the proof would break if one tried to transfer it over?

DavidVoid|7 months ago

> the algorithm will wrongfully report "false" for arrays like e.g. [INT_MIN, -1]

If you have INT_MIN along with any other negative number in the array then your program has undefined behavior in C. Signed integer overflow is UB (but unsigned overflow is not).
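The fixed-width mismatch is easy to demonstrate. A minimal sketch in Python, simulating C's well-defined unsigned 32-bit wraparound (the helper name is made up for illustration):

```python
# Simulate C unsigned 32-bit addition, which wraps modulo 2**32.
UINT_MAX = 2**32 - 1

def wraps_to_zero_u32(a: int, b: int) -> bool:
    """True iff a + b is 0 under unsigned 32-bit (modular) arithmetic."""
    return (a + b) % 2**32 == 0

# Over arbitrary-precision integers, UINT_MAX + 1 is not 0,
# so the verified algorithm correctly answers "false" for [UINT_MAX, 1].
print(UINT_MAX + 1 == 0)               # False
# Under modular arithmetic, the same pair wraps to zero.
print(wraps_to_zero_u32(UINT_MAX, 1))  # True
```

This is exactly why the proof would not transfer unchanged: the specification would have to be stated over the modular type, not over the mathematical integers.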

Jtsummers|7 months ago

> the algorithm will wrongfully report "false" for arrays like e.g. [INT_MIN, -1]

`INT_MIN + -1` is not 0 so it should report false in that case.

For UINT_MAX, the algorithm would need to be reconsidered, though, since it's written with signed integers in mind.

> Hopefully the proof would break if one tried to transfer it over?

Hopefully. The proof would have to be modified to account for the actual types. If you're using bounded integers you'd need to write a different proof.

andrepd|7 months ago

But false is the correct result for those cases. Addition is addition and overflow is undefined (= can assume that doesn't happen), it's not addition modulo 2^n.

ethan_smith|7 months ago

Good point about overflow - Lean can actually model machine integers with bounded arithmetic operations, allowing you to formally verify these edge cases by explicitly reasoning about overflow behavior.

zelphirkalt|7 months ago

Wait, so much effort and it doesn't even consider this widely known issue? That would mean that, even though all this effort has been spent, a decent programmer still has a better idea of whether something is correct than the proof system used here. Worse, it might lull one into thinking it must be correct, while it actually breaks on a simple case.

necunpri|7 months ago

This is the strength of typing, right?

If I can specify the type of my input I can ensure the verification.

munchler|7 months ago

Lean is awesome and this is an impressive new feature, but I can't help but notice that the proof is significantly longer and more complex than the program itself. I wonder how well this will scale to real-world programs.

armchairhacker|7 months ago

Real-world programs can be verified by formally proving properties on a small part of the code (called the kernel) in a way that transitively guarantees those for the remaining code.

For example, Rust's borrow checker guarantees* memory safety of any code written in Rust, even a 10M+ LOC project. Another example is sel4, a formally-verified micro-kernel (https://sel4.systems/About/seL4-whitepaper.pdf).

* Technically not; even if the code doesn't use `unsafe`, not only is Rust's borrow checker not formally verified, there are soundness holes (https://github.com/rust-lang/rust/issues?q=is%3Aopen%20is%3A...). However, in theory it's possible to formally prove that a subset of Rust can only encode memory-safe programs, and in practice Rust's borrow checker is so effective that a 10M+ LOC project without unsafe still probably won't have memory issues.

amw-zero|7 months ago

Research points to there being a quadratic relationship between automated proof and code size: https://trustworthy.systems/publications/nictaabstracts/Mati....

Specifically, the relationship is between the _specification_ and the proof, and it was done for proofs written in Isabelle and not Lean.

The good news is that more and more automation is possible for proofs, so the effort to produce each proof line will likely go down over time. Still, the largest full program we've fully verified is much less than 100,000 LOC. seL4 (verified operating system) is around 10,000 lines IIRC.

rtpg|7 months ago

My belief is that the core parts of a lot of programs where you want formal proofs are tricky enough that you need hand-holding, but the higher up the stack you go, the simpler your proofs get, since you did the hard stuff in the core.

Though it really does feel like we're still scratching the surface of proof writing practices. A lot of proofs I've seen seem to rely only on very low level building blocks, but stronger practitioners more immediately grab tools that make stuff simpler.

I would say, though, that it feels likely that your proofs are always going to be at least within an order of magnitude of your code, because in theory the longer your code is the more decision points there are to bring up in your proof as well. Though automatic proof searches might play out well for you on simpler proofs.

grumbelbart|7 months ago

Long-term, this would be done using LLMs. It would also solve LLMs' code quality issues: they could simply prove that the code works correctly.

ryjo|7 months ago

Very cool. Neat how you managed to get logical symbols into the language itself! When might someone use preconditions in Lean theorems?

This article caught my eye because it's focused on imperative programming, and I've been very focused on declarative vs imperative programming over the last few years. I implemented a version of your function in CLIPS, a Rules-based language that takes a declarative approach to code:

    (defrule sum-is-0
      (list $? ?first $? ?second $?)
      (test (= 0 (+ ?first ?second)))
      =>
      (println TRUE))

    (defrule sum-is-not-0
      (not (and (list $? ?first $? ?second $?)
                (test (= 0 (+ ?first ?second)))))
      =>
      (println FALSE))

    (assert (list 1 0 2 -1))
    (run)
    (exit)

The theorem you write in Lean to prove the function kind-of exists in CLIPS Rules; you define the conditions that must occur in order to execute the Right Hand Side of the Rule. Note that the above simply prints `TRUE` or `FALSE`; it is possible to write imperative `deffunction`s that return values in CLIPS, but I wanted to see if I could draw parallels for myself between Lean code and theorems. Here's a gist with the simple version and a slightly more robust version that describes the index at which the matching numbers appear: https://gist.github.com/mrryanjohnston/680deaee87533dfedc74b...

Thank you for writing this and for your work on Lean! This is a concept that's been circling in my head for a minute now, and I feel like this article has unlocked some level of understanding I was missing before.

drdeca|7 months ago

Very nice. However, I wonder whether it might be good to have a way to tell Lean to spit out a more explicit form of the proof steps it obtained during `grind`? Like, to produce text for what one would put in there if one were doing it manually, that would work in place of `grind`, in case the grind step is slow to verify?

yuppiemephisto|7 months ago

`show_term grind` and `by grind?` should do what you want

jeremyscanvic|7 months ago

That's really neat! I'm very excited for the future of Lean.

norir|7 months ago

My brain has been slowly trained to reject imperative programming. This example could be rewritten in a tail recursive manner using an immutable set which would be simpler to verify for correctness even without a formal verifier.

I have found that while there is a learning curve to programming using only recursion for looping, code quality does go significantly up under this restriction.

Here is why I personally think tail recursion is better than looping: with tail recursion, you are forced to explicitly reenter the loop. Right off the bat, this makes it difficult to inadvertently write an infinite loop. The early exit problem is also eliminated, because you just return instead of making a recursive call. Moreover, using recursion generally forces you to name the function that loops, which gives more documentation than a generic for construct. A halfway decent compiler can also easily detect tail recursion and rewrite it as a loop (and inline it, if the recursive function is only used in one place), so there need not be any runtime performance cost to using tail recursion instead of looping.

Unfortunately, many languages do not support tail call optimization or nested function definitions, and also have excessively wordy function definition syntax, which makes loops more convenient to write in those languages. This conditions one to think in loops rather than tail recursion. Personally, I think Lean would be better if it hadn't given in and supported imperative code, and had instead helped users learn to think recursively.
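For concreteness, here is a sketch of the tail-recursive, immutable-set style described above, applied to a zero-sum-pair check like the one discussed in the thread. Python is used purely for illustration (CPython does not perform tail-call optimization); the names are made up:

```python
def has_zero_pair(xs):
    """Return True iff some two elements of xs (at distinct positions) sum to zero."""
    def go(i, seen):
        # Each step either returns a final answer or makes exactly
        # one tail call: re-entering the "loop" is explicit.
        if i == len(xs):
            return False                  # base case: array exhausted
        if -xs[i] in seen:
            return True                   # early exit is just a return
        return go(i + 1, seen | {xs[i]})  # tail call with an extended immutable set
    return go(0, frozenset())

print(has_zero_pair([1, 0, 2, -1]))  # True  (1 + -1 == 0)
print(has_zero_pair([3, 4, 5]))      # False
```

Note that the helper `go` carries the loop state (index and seen-set) as explicit arguments rather than as mutable variables.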

taeric|7 months ago

This feels overly strong? I've certainly messed up my fair share of recursive calls.

I don't know why, but I have actually landed a bit more on the imperative side of the divide in recent years. To the point that I found writing, basically, a GOTO-based implementation of an idea in Lisp to be easier than trying to do it using either loops or recursion. Which really surprised me.

I /think/ a lot of the difference comes down to how localized the thinking is. If I'm able to shrink the impact of what I want to do down to a few arguments, then recursion helps a ton. If I'm describing a constrained set of repetitive actions, loops. If I'm trying to hold things somewhat static as I perform different reduction and such, GOTO works.

I think "functional" gets a bit of a massive boost from advocates because a lot of functional code is presented as declarative. But that doesn't have to be the case. Nor can that help you if someone else hasn't done the actual implementation.

We can get a long way with very mechanical transformations, in the form of compilation. But the thinking can still have some very imperative aspects.

tsimionescu|7 months ago

I think you're in a tiny minority if you think it's easier to read and understand algorithms written using recursive tail calls than imperative loops. Even outside of programming, take a look at most work being done in algorithm research - you'll see that most often algorithms are described in an imperative pseudo-code, and using a mix of loops and regular recursion (not tail recursion), with maybe some mapping and filtering constructs in addition.

Tail recursion in particular is the least human-friendly way to represent looping. It mixes input and output parameters of the function in all but the most trivial cases. It also forces the caller to figure out what the base value is for all of the output parameters (often requiring a separate function just to provide those to callers). And it basically takes an implementation detail and makes it part of your function interface.

Even in math, you typically define recursive functions like f(x) = 1 + f(x-1), not f(x, y) = f(x-1, y+1); g(x) = f(x, 0).
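The contrast can be sketched in Python (illustrative names only):

```python
def length_direct(xs):
    # f(x) = 1 + f(x - 1): the shape matches the mathematical
    # definition, but it is not tail-recursive.
    if not xs:
        return 0
    return 1 + length_direct(xs[1:])

def length_tail(xs, acc=0):
    # f(x, y) = f(x - 1, y + 1): tail-recursive, but the accumulator
    # `acc` (an implementation detail) leaks into the interface, and
    # callers must know its correct base value (0 here).
    if not xs:
        return acc
    return length_tail(xs[1:], acc + 1)

print(length_direct([1, 2, 3]))  # 3
print(length_tail([1, 2, 3]))    # 3
```

The default argument plays the role of the wrapper g(x) = f(x, 0) from the comment above.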

ngruhn|7 months ago

Another case for recursion is that you have to think of the base case. With loops people pathologically forget to handle 0, [], [[]], "", etc.
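A tiny illustration in Python (hypothetical functions, assuming a max-of-list example):

```python
def max_loop(xs):
    # Loop version: nothing forces you to consider the empty input,
    # so this raises IndexError on [].
    m = xs[0]
    for x in xs[1:]:
        if x > m:
            m = x
    return m

def max_rec(xs):
    # Recursive version: writing the base case makes the empty input
    # an explicit decision rather than an oversight.
    if not xs:
        return None  # the 0-element case had to be confronted
    if len(xs) == 1:
        return xs[0]
    rest = max_rec(xs[1:])
    return xs[0] if xs[0] > rest else rest

print(max_rec([3, 1, 4, 1, 5]))  # 5
print(max_rec([]))               # None
```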

buzzin_|7 months ago

I have found that if you invest some time in learning how to write quality for loops, the quality indeed goes up.

Also, when writing for loops, you have to explicitly think about your exit condition for the loop, and it is visible right there, at the top of the loop, making infinite loops almost impossible.

immibis|7 months ago

Computers are imperative. With imperative programming you can get better performance than with pure functional. (Using pure functional to wrap an imperative core still counts as imperative)

Maybe you don't care about performance; IMO squeezing performance is one of the important applications for formal verification, as you can prove your fast insane algorithm is correct, whereas the slow obvious one is obviously correct.

If your main concern is clarity, some things are clearer when written imperatively and some when written functionally.

louthy|7 months ago

I agree, but also folds, traversals, list comprehensions, and recursion schemes work well and can be even more resistant to common bugs than regular recursion.

Although it’s hard to fault the simple elegance of recursion!

kevindamm|7 months ago

Which languages do support TCO at this point? From my recollection we have

* Scheme

* Haskell

* Elixir

* Erlang

* OCaml

* F#

* Scala

* (not Clojure)

* the JVM could remove tail-recursive calls, but IIRC this still hasn't been added for security reasons

* Racket

* Zig

* Lua

* Common Lisp, under certain compilers/interpreters

* Rust? (depends)

* Swift? (sometimes)

bux93|7 months ago

It looks suspiciously like verifying imperative code by writing declarative code that does the same thing.

SUPERX_112|7 months ago

nice

revskill|7 months ago

U must have at least 5 characters to get an upvote.