(no title)
obl | 2 years ago
int main(){
int x;
if (x <= 42){
assert(x != 12345);
}
}
is of course UB in C, even though under a "uninitialized is random" model the program is valid and does not assert (as the model checker concludes).(even in O1 clang gets rid of the whole function, including even the ret instruction, I'm surprised it does not at least leave an ud2 for an empty function to help debugging since it would not cost anything https://godbolt.org/z/eK8cz3EPe )
nanolith|2 years ago
This isn't perfect, of course. In this case, the compiler can rightly treat x as uninitialized, meaning that its value could be <= 42 at one point, and not be <= 42 at another point. Since the uninitialized variable isn't "pinned", it could technically be different values in different locations.
CBMC's machine model works differently. In this case, x is assigned a non-deterministic value. The branch condition creates a refinement of this value within the scope of that statement. If the branch is taken, then by SMT rules, x can't equal 12345, because it was already refined as being <= 42.
On its own, a model checker can miss situations like these. It's why I recommend -Wall -Werror -Wpedantic in conjunction with CBMC. The compiler should catch this as a warning, and it should be upgraded as an error.
zzo38computer|2 years ago
LLVM has a "freeze" command to stop propagation of undefined values (although I think that command was added later than the first version), so that the value is "pinned" as you say. However, the "undef" command, if you do not use "freeze", will not do this.
I think that the C compiler should "pin" such undefined values where they are used, but I don't know which compilers have an option to do this. (Perhaps CBMC should also have such a switch, so that you can use the same options that you will use with the C compiler.)
With this ex.3 file, the optimizer should be allowed to result in a function that does nothing and has an unspecified return value (which might or might not be the same each time it is executed). (If optimizations are disabled, then it should actually compile the conditional branch instruction.)
kazinator|2 years ago
The reasoning about nondeterministic values should be spared for situations when it's other than undefined behavior. For instance, accessing an uninitialized structure member that came from malloc isn't undefined behavior. It's (I think) unspecified behavior. In an implementation that has trap representations for some types, it could hit a trap.
magicalhippo|2 years ago
Brian_K_White|2 years ago
I don't think accepting the random initial value means that the value may continue to change until the first time you set a value.
fweimer|2 years ago
So the checker treats them as defined, but with an unknown variable. You could have written this instead:
And leaving unknown_int_value undefined (so it's not visible to the analyzer). Or write a function and use x as a parameter.I suspect CBMC does this to have a convenient syntax for this frequent scenario. Apparently, it's used quite often, as in these examples: https://model-checking.github.io/cbmc-training/cbmc/overview...
It seems that CBMC is not intended to check production sources directly against C semantics, but to prove things about programs written in a C-like syntax.
philzook|2 years ago
jcranmer|2 years ago
Any invocation of undefined behavior should be considered an assertion failure as far as any model checker should be concerned. Compilers can--and will--treat undefined behavior as license to alter the semantics of your program without any constraint, and most instances of undefined behavior are clearly programmer error (there is no good reason to read uninitialized memory, for example).
Reading an uninitialized variable is not "subtle" undefined behavior. It's one of the most readily accessible examples not only of what undefined behavior can exist, but also the ways compilers will mutilate your code just because you did it. To be honest, if something as simple as the consequences of reading uninitialized memory are shaky understanding for someone trying to prove code correct, that will completely undermine any trust I have in the validity of your proofs.
philzook|2 years ago
- UB sanitizer https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
- Cerberus Semantics https://www.cl.cam.ac.uk/~pes20/cerberus/
- https://github.com/kframework/c-semantics
- https://github.com/TrustInSoft/tis-interpreter
philzook|2 years ago