top | item 46295778

(no title)

bkettle | 2 months ago

I think formal verification shines in areas where implementation is much more complex than the spec, like when you’re writing incomprehensible bit-level optimizations in a cryptography implementation or compiler optimization phases. I’m not sure that most of us, day-to-day, write code (or have AI write code) that would benefit from formal verification, since to me it seems like high-level programming languages are already close to a specification language. I’m not sure how much easier to read a specification format that didn’t concern itself with implementation could be, especially when we currently use all kinds of frameworks and libraries that already abstract away implementation details.

Sure, formal verification might give stronger guarantees about various levels of the stack, but I don’t think most of us care about having such strong guarantees now and I don’t think AI really introduces a need for new guarantees at that level.

discuss

order

pron|2 months ago

> to me it seems like high-level programming languages are already close to a specification language

They are not. The power of rich and succinct specification languages (like TLA+) comes from the ability to succinctly express things that cannot be efficiently computed, or at all. That is because a description of what a program does is necessarily at a higher level of abstraction than the program (i.e. there are many possible programs or even magical oracles that can do what a program does).

To give a contrived example, let's say you want to state that a particular computation terminates. To do it in a clear and concise manner, you want to express the property of termination (and prove that the computation satisfies it), but that property is not, itself, computable. There are some ways around it, but as a rule, a specification language is more convenient when it can describe things that cannot be executed.

nyrikki|2 months ago

TLA+ is not a silver bullet, and like all temporal logic, has constraints.

You really have to be able to reduce your models to: “at some point in the future, this will happen," or "it will always be true from now on”

Have probabilistic outcomes? Or even floats [0] and it becomes challenging and strings are a mess.

> Note there is not a float type. Floats have complex semantics that are extremely hard to represent. Usually you can abstract them out, but if you absolutely need floats then TLA+ is the wrong tool for the job.

TLA+ works for the problems it is suitable for, try and extend past that and it simply fails.

[0] https://learntla.com/core/operators.html

eru|2 months ago

What you said certainly works, but I'm not sure computability is actually the biggest issue here?

Have a look at how SAT solvers or Mixed Integer Linear Programming solvers are used.

There you specify a clear goal (with your code), and then you let the solvers run. You can, but you don't need to, let the solvers run all the way to optimality. And the solvers are also allowed to use all kinds of heuristics to find their answers, but that doesn't impact the statement of your objective.

Compare that to how many people write code without solvers: the objective of what your code is trying to achieve is seldom clearly spelled out, and is instead mixed up with the how-to-compute bits, including all the compromises and heuristics you make to get a reasonable runtime or to accommodate some changes in the spec your boss asked for at the last minute.

Using a solver ain't formal verification, but it shows the same separation between spec and implementation.

Another benefit of formal verification, that you already imply: your formal verification doesn't have to determine the behaviour of your software, and you can have multiple specs simultaneously. But you can only have a single implementation active at a time (even if you use a high level implementation language.)

So you can add 'handling a user request must terminate in finite time' as a (partial) spec. It's an important property, but it tells you almost nothing about the required business logic. In addition you can add "users shouldn't be able to withdraw more than they deposited" (and other more complicated rules), and you only have to review these rules once, and don't have to touch them again, even when you implement a clever new money transfer routine.

avmich|2 months ago

Peter Norvig once proposed to consider a really large grammar, with trillion rules, which could simulate some practically small applications of more complex systems. Many programs in practice don't need to be written in Turing-complete languages, and can be proven to terminate.

mrkeen|2 months ago

Can TLA+ prove anything about something you specify but don't execute?

anon-3988|2 months ago

> To give a contrived example, let's say you want to state that a particular computation terminates. To do it in a clear and concise manner, you want to express the property of termination (and prove that the computation satisfies it), but that property is not, itself, computable. There are some ways around it, but as a rule, a specification language is more convenient when it can describe things that cannot be executed.

Do you really think it is going to be easier for the average developer to write a specification for their program that does not terminate

vs

Giving them a framework or a language that does not have for loop?

Edit: If by formal verification you mean type checking. That I very much agree.

socketcluster|2 months ago

Yes. I feel like people who are trying to push software verification have never worked on typical real-world software projects where the spec is like 100 pages long and still doesn't fully cover all the requirements and you still have to read between the lines and then requirements keep changing mid-way through the project... Implementing software to meet the spec takes a very long time and then you have to invest a lot of effort and deep thought to ensure that what you've produced fits within the spec so that the stakeholder will be satisfied. You need to be a mind-reader.

It's hard even for a human who understands the full business, social and political context to disambiguate the meaning and intent of the spec; to try to express it mathematically would be an absolute nightmare... and extremely unwise. You would literally need some kind of super intelligence... And the amount of stream-of-thought tokens which would have to be generated to arrive at a correct, consistent, unambiguous formal spec is probably going to cost more than just hiring top software engineers to build the thing with 100% test coverage of all main cases and edge cases.

Worst part is; after you do all the expensive work of formal verification; you end up proving the 'correctness' of a solution that the client doesn't want.

The refactoring required will invalidate the entire proof from the beginning. We haven't even figured out the optimal way to formally architect software that is resilient to requirement changes; in fact, the industry is REALLY BAD at this. Almost nobody is even thinking about it. I am, but I sometimes feel like I may be the only person in the world who cares about designing optimal architectures to minimize line count and refactoring diff size. We'd have to solve this problem first before we even think about formal verification of 'most software'.

Without a hypothetical super-intelligence which understands everything about the world; the risk of misinterpreting any given 'typical' requirement is almost 100%... And once we have such super-intelligence, we won't need formal verification because the super-intelligence will be able to code perfectly on the first attempt; no need to verify.

And then there's the fact that most software can tolerate bugs... If operationally important big tech software which literally has millions of concurrent users can tolerate bugs, then most software can tolerate bugs.

DennisP|2 months ago

Software verification has gotten some use for smart contracts. The code is fairly simple, it's certain to be attacked by sophisticated hackers who know the source, and the consequence of failure is theft of funds, possibly in large amounts. 100% test coverage is no guarantee that an attack can't be found.

People spend gobs of money on human security auditors who don't necessarily catch everything either, so verification easily fits in the budget. And once deployed, the code can't be changed.

Verification has also been used in embedded safety-critical code.

robot-wrangler|2 months ago

The whole perspective of this argument is hard for me to grasp. I don't think anyone is suggesting that formal specs are an alternative to code, they are just an alternative to informal specs. And actually with AI the new spin is that they aren't even a mutually exclusive alternative.

A bidirectional bridge that spans multiple representations from informal spec to semiformal spec to code seems ideal. You change the most relevant layer that you're interested in and then see updates propagating semi-automatically to other layers. I'd say the jury is out on whether this uses extra tokens or saves them, but a few things we do know. Chain of code works better than chain of thought, and chain-of-spec seems like a simple generalization. Markdown-based planning and task-tracking agent workflows work better than just YOLOing one-shot changes everywhere, and so intermediate representations are useful.

It seems to me that you can't actually get rid of specs, right? So to shoot down the idea of productive cooperation between formal methods and LLM-style AI, one really must successfully argue that informal specs are inherently better than formal ones. Or even stronger: having only informal specs is better than having informal+formal.

ad_hockey|2 months ago

I agree that trying to produce this sort of spec for the entire project is probably a fool's errand, but I still see the value for critical components of the system. Formally verifying the correctness of balance calculation from a ledger, or that database writes are always persisted to the write ahead log, for example.

qingcharles|2 months ago

I used to work adjacent to a team who worked from closely-defined specs for web sites, and it used to infuriate the living hell out of me. The specs had all sorts of horrible UI choices and bugs and stuff that just plain wouldn't work when coded. I tried my best to get them to implement the intent of the spec, not the actual spec, but they had been trained in one method only and would not deviate at any cost.

marcosdumay|2 months ago

There are many really important properties to enforce even on the most basic CRUD system. You can easily say things like "an anonymous user must never edit any data, except for the create account form", or "every user authorized to see a page must be listed on the admin page that lists what users can see a page".

People don't verify those because it's hard, not for lack of value.

bkettle|2 months ago

Yeah fair enough. I can definitely see the value of property-based verification like this and agree that useful properties could be easy to express and that LLMs could feasibly verify them. I think full verification that an implementation implements an entire spec and nothing else seems much less practical even with AI, but of course that is just one flavor of verification.

Maxion|2 months ago

Even

> "an anonymous user must never edit any data, except for the create account form"

Can quickly end up being

> "an anonymous user must never edit any data, except for the create account form, and the feedback form"

And a week later go to

> "an anonymous user must never edit any data, except for the create account form, the feedback form, and the error submission form if they end up with a specific type of error"

And then during christmas

> > "an anonymous user must never edit any data, except for the create account form, the feedback form, and the error submission form if they end up with a specific type of error, and the order submission form if they visit it from this magic link. Those visiting from the magic link, should not be able to use the feedback form (marge had a bad experience last christmas going through feedbacks from the promotional campaign)"

rmah|2 months ago

Yes, except their cookie preferences to comply with european law. Oh, and they should be able to change their theme from light/dark but only that. Oh and maybe this other thing. Except in situations where it would conflict with current sales promotions. Unless they're referred by a reseller partner. Unless it's during a demo, of course. etc, etc, etc.

This is the sort of reality that a lot of developers in the business world deals with.

amw-zero|2 months ago

Compare the spec with the application here: https://concerningquality.com/model-based-testing/

I think we've become used to the complexity in typical web applications, but there's a difference between familiar and simple (simple vs. easy, as it were). The behavior of most business software can be very simply expressed using simple data structures (sets, lists, maps) and simple logic.

No matter how much we simply it, via frameworks and libraries or whatever have you, things like serialization, persistence, asynchrony, concurrency, and performance end up complicating the implementation. Comparing this against a simpler spec is quite nice in practice - and a huge benefit is now you can consult a simple in-memory spec vs. worrying about distributed system deployments.

giancarlostoro|2 months ago

> especially when we currently use all kinds of frameworks and libraries that already abstract away implementation details.

This is my issue with algorithm driven interviewing. Even the creator of Homebrew got denied by Google because he couldn't do some binary sort or whatever it even was. He made a tool used by millions of developers, but apparently that's not good enough.

StilesCrisis|2 months ago

Google denies qualified people all the time. They would much rather reject a great hire than take a risk on accepting a mediocre one. I feel for him but it's just the nature of the beast. Not everyone will get in.

UltraSane|2 months ago

AWS has said that having formal verification of code lets them be more aggressive in optimization while being confidant it still adheres to the spec. They claim they were able to double the speed of IAM API auth code this way.