accelbred | 21 days ago

I think you are entirely missing the author's point. The author is generalizing from the specific technicalities of C/Rust/etc. UB to the underlying problem with UB: once it is triggered, you can't know what the program will do. That does not have to be the result of a language specification. If you write safe Rust yourself, then yes, UB will usually not occur, and you can know what will happen based on the code you wrote. The author extends UB to vibecoding, where there is no specification that defines how prompts are translated to code. Without thorough review, you are unable to be sure that the output code matches the intent of your prompting, which is analogous to writing code with UB. The issue the author has with vibecoded Rust is not that the code can trigger undefined behavior at the language layer, but that the perfectly "safe" code generated may not match the intended semantics at all.

AlotOfReading | 21 days ago

The problem with the author's argument is the inductions don't follow from the premise. With defined C, you can in principle look at a piece of code and know what it will do in the abstract machine (or at least build a model dependent on assumptions about things like unspecified behavior). Actually doing this may be practically impossible, but that's not the point. It's not possible in the presence of UB. You can't know what a piece of code containing UB will do, even in principle.
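
A minimal sketch of that distinction, with function names made up for illustration. Both functions try to answer "would x + 1 overflow?". The first asks the question by performing the overflow, so for x == INT_MAX the abstract machine assigns it no meaning at all. The second only ever evaluates defined expressions, so you can reason about it for every input.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: UB when x == INT_MAX, because signed overflow in
 * x + 1 is undefined. An optimizer is allowed to assume the comparison
 * is always false. */
bool next_overflows_ub(int x) {
    return x + 1 < x;
}

/* Hypothetical helper: defined for every int, because it never performs
 * the overflowing addition. */
bool next_overflows_defined(int x) {
    return x == INT_MAX;
}

/* Usage: only exercises inputs for which both functions are defined. */
void check_defined_inputs(void) {
    assert(!next_overflows_ub(41));
    assert(!next_overflows_defined(41));
    assert(next_overflows_defined(INT_MAX));
}
```

Note that the usage function deliberately never passes INT_MAX to the first helper; there is no result to assert on for that call.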

You can in principle read the LLM's output and know that it won't put your credentials on the net, so it's not the same as UB. Maybe there are practical similarities to UB in how LLM bugs present, but I'm not sure it's a useful comparison and it's not the argument the author made.

pseudohadamard | 19 days ago

The practical impossibility is a real issue. See the recent post about booleans in Doom, where the author knew what the problem was and where it was, but after reading through the standards still couldn't really find where it was forbidden, eventually saying "it's probably this bit because I can't find anything else that fits".

And when the author of the current post says:

  Turn on all linting, all warnings,

this doesn't help. I've seen code compiled with -Wall -Wextra -Wtf that produces zero warnings but for which gcc happily outputs code that segfaults, crashes, or otherwise breaks catastrophically when run. So the compiler is saying "I've found UB here, I'm not going to say anything despite maximum warnings being turned on, I'm just going to output code that I know will fail when run".
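
One common shape this takes, as a small sketch (the function names are hypothetical, and I have not checked the diagnostics of any particular gcc version): whether the first function has UB depends on a value known only at run time, so there is little for a compile-time warning to latch onto. The second turns the out-of-range case into an ordinary, defined error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* UB if i >= n: reads outside the array. Nothing in this function tells
 * the compiler what i will be at run time. */
int get_unchecked(const int *values, size_t n, size_t i) {
    (void)n; /* the length is available but deliberately ignored */
    return values[i];
}

/* Defined for every i: an out-of-range index is reported, not performed. */
bool get_checked(const int *values, size_t n, size_t i, int *out) {
    assert(values != NULL && out != NULL);
    if (i >= n) {
        return false;
    }
    *out = values[i];
    return true;
}
```

Runtime tooling such as sanitizers is the usual answer for the first kind of function, since the problem only exists for particular inputs.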

lelanthran | 20 days ago

> The problem with the author's argument is the inductions don't follow from the premise.

That's possible. No one ever accused me of sound arguments :-)

I would still like to address your comment anyway.

Let's call this assertion #1:

> With defined C, you can in principle look at a piece of code and know what it will do in the abstract machine ... It's not possible in the presence of UB.

And let's call this assertion #2:

> You can in principle read the LLM's output and know that it won't put your credentials on the net, so it's not the same as UB.

With Assertion #1 you state you are not examining the output of the compiler, you are only examining the input (i.e. the source code).

With Assertion #2, you state you are examining the output of the LLM, and you are not examining the input.

IOW, these two actions are not comparable because in one you examine only the input while in the other you examine only the output.

In short: you are comparing analysing the input in one case with analysing the output in another case.

For the case of accidentally doing $FOO when trying to do $BAR:

1. No amount of input-analysis on LLM prompts will ever reveal to you if it generated code that will do $FOO - you have to analyse the output. There is a zero percent chance that examining the prompt "Do $BAR" will reveal to the examiner that their credentials will be leaked by the generated code.

2. There are many automated input-analysis tools for C that catch a large amount of UB and so prevent $FOO when the code implements "Do $BAR". Additionally, while a lot of UB gets through, a great deal is actually caught during review.

Think of the case: "I wrote code to add two numbers, but UB caused files to get deleted off my computer"

In C, this was always possible (and C programmers acted accordingly). In Java, C#, Rust, etc this was never possible. Unless your code was generated by an LLM.
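
As a rough sketch of the "add two numbers" case (kept in C for consistency; every name here is invented): `add_checked` is what the prompt intended, and it stays within defined behaviour by testing the operands before adding. `add_generated` stands in for hypothetical generated output. It contains no UB at all, yet it quietly returns 0 on overflow, which nobody asked for. No analysis of the prompt would reveal that; only reading or testing the output does.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Intended semantics: store a + b in *sum, or report overflow.
 * The range test happens before the addition, so no UB is possible. */
bool add_checked(int a, int b, int *sum) {
    assert(sum != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *sum = a + b;
    return true;
}

/* Hypothetical generated code: fully defined, but on overflow it
 * silently yields 0 instead of reporting an error.
 * Assumes int is narrower than long long, as on common platforms. */
int add_generated(int a, int b) {
    long long wide = (long long)a + (long long)b;
    if (wide > INT_MAX || wide < INT_MIN) {
        return 0;
    }
    return (int)wide;
}
```

Both functions agree on every input that does not overflow, so a casual test with small numbers would not tell them apart.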

nrds | 20 days ago

If that's the author's point then the article needs a rewrite. I suspect that was _not_ the author's point and it's offered as a good faith but misplaced post-hoc justification.

lelanthran | 20 days ago

>> Without thorough review, you are unable to be sure that the output code matches the intent of your prompting, which is analogous to writing code with UB.

> If that's the author's point then the article needs a rewrite. I suspect that was _not_ the author's point and it's offered as a good faith but misplaced post-hoc justification.

I am the author (thanks for giving some of your valuable attention to my post; much appreciated :-), and I can confirm that the `>> ...` quoted bit above is my point, and this bit of my blog post is where I made that specific point:

> As of today, there is a large and persistent drive to not just incorporate LLM assistance into coding, but to (in the words of the pro-LLM-coding group) “Move to a higher level of abstraction”.

> What this means is that the AI writes the code for you, you “review” (or not, as stated by Microsoft, Anthropic, etc), and then push to prod.

> Brilliant! Now EVERY language can exhibit UB.

Okay, fair enough, I'm not the world's best writer, but I thought that bit was pretty clear when I wrote it. I still think it's clear. Especially the "Now EVERY language can exhibit UB" bit.

I'm now half inclined to paste the entire blog into a ChatAI somewhere and see what it thinks my conclusion is...