top | item 46538259

(no title)

BalinKing | 1 month ago

The file SSOT.lean is completely trivial, I think: Unfolding the definitions in the theorems, they say nothing but "x=1 => x=1", "x=1 => x≤1", and "x≠1 => x=0 ∨ x>1" (where x is a natural number). Basically, there's no actual proof here, at least not in that file....

This is indeed the danger of letting LLMs manage a "proof" end-to-end—they can just pick the wrong definitions and theorems to prove, and even though the proofs they then give will be formally sound, they won't prove anything useful.

discuss

trissim|1 month ago

You only read the 37 lines in SSOT.lean and stopped. It's the entry point that defines DOF=1 so other files can import it. The actual proofs are in Foundations.lean (364 lines - timing trichotomy, causality), Requirements.lean (derives the two necessary language features), Completeness.lean (mechanism exhaustiveness), Derivation.lean (the uniqueness proof that achieves_ssot m = true iff m = source_hooks), Coherence.lean, CaseStudies.lean, LangPython.lean, LangRust.lean etc.

~2k lines total across the lean files. Zero sorry. Run grep -r "sorry" paper2_ssot/proofs/ if you don't believe me.

"Unfolding the definitions they say x=1 => x=1" applies to three sanity lemmas in the scaffolding file. It's like reading __init__.py and concluding the package is empty.

BalinKing|1 month ago

See my other comment—LangRust.lean is the same way.

EDIT: Just skimmed Completeness.lean, and it looks similar—at a glance, even the 3+-line proofs are very short and look a lot like boilerplate.

BalinKing|1 month ago

I also took a look at the `LangRust.lean`, and the majority of the proofs are just `rfl` (after an `intros`)—that's a major red flag, since it means the "theorems", like those in SSOT.lean, are true just by unfolding definitions. In general, that's basically never true of any interesting fact in programming languages (or math in general); on the contrary, it takes a lot of tedious work even to prove simple things in Lean.

trissim|1 month ago

Yes, many proofs are rfl. That's because we're doing engineering formalization, not pure math. The work is in getting the definitions right. Once you've correctly modeled Rust's compilation phases, item sources, and erasure semantics, the theorem that "RuntimeItem has no source field, therefore query_source returns none" should be rfl. That's the point.

The hard part isn't the proof tactics. The hard part is:

- Correctly modeling Rust's macro expansion semantics from the language reference

- Defining the compilation phases and when information is erased

- Structuring the types so that the impossibility is structural (RuntimeItem literally doesn't have a source field) If the theorems required 500 lines of tactic proofs, that would mean our model was wrong or overcomplicated. When you nail the definitions, rfl is the proof.

Compare to software verification: when you prove a sorting algorithm correct, the hard work is the loop invariants and the model, not the final QED. Tedious proof steps usually indicate you're fighting your abstractions.

The real question isn't "are the proofs short?" It's "can you attack the definitions?" The model claims RuntimeItem erases source info at compile-to-runtime. Either produce Rust code where RuntimeItem retains its macro provenance at runtime, or accept the model is correct. The rfl follows from the model being right.