(no title)
flatfinger | 5 years ago
I would have no qualm with the way clang and gcc process various constructs if they were to explicitly state that its maintainers make no effort to make their optimizer suitable for any tasks involving the receipt of untrustworthy input. Instead, however, they claim that their optimizers are suitable for general-purpose use, despite the fact that their behavior isn't reliably suitable for many common purposes.
> This is actually completely unneeded, even for optimisation.
Consider the following function:
unsigned long long test(unsigned long long x, int mode)
{
do
x = slow_function_no_side_effects();
while(x > 1);
if (mode)
return 1;
else
return x;
}
Suppose the function is passed a value of "x" which would get caught in a cycle that never hits zero or one, but "mode" is 1. If the code is processed by performing every individual step in order, the function would never return. The rule in C11 is designed to avoid requiring that generated code compute the value of x when its only possible effect on the program's execution would be to prevent the execution of code that doesn't depend on its value.Suppose the most important requirement that function test() must meet is that it must never return 1 unless mode is 1, or the iteration on x would yield 1 before it yields zero; returning 1 in any other cases would cause the computer's speaker to start playing Barney's "I love you" song, and while looping endlessly would be irksome, it would be less bad than Barney's singing. If a compiler determines that slow_function_no_side_effects() will never return an even number, should it be entitled to generate code that will return 1 when mode is zero, without regard for whether the loop actually completes?
I would think it reasonable for a compiler to defer/skip the computation of x in cases where mode is 1, or for a compiler that can tell that "x" will never be an even number to generate code that, after ensuring that the loop will actually terminate, would unconditionally return 1. Requiring that the programmer write extra code to ensure that the function not return 1 in cases where mode is zero but the loop doesn't terminate would defeat the purpose of "optimization".
a1369209993|5 years ago
flatfinger|5 years ago
In a broader sense, I think the problem is that the authors of the Standard have latched onto the idea that optimizations must not be observable unless a program invokes Undefined Behavior, and consequently any action that would make the effects of an optimization visible must be characterized as UB.
I think it would be far more useful to recognize that optimizations may, on an opt-in or opt-out basis, be allowed to do various things whose effects would be observable, and correct programs that would allow such optimizations must work correctly for any possible combination of effects. Consider the function:
Should the behavior of test2() be defined despite the fact that `temp` is not fully written before it is copied to `x` and `y`? What if anything should be guaranteed about the values of `x.a[2..99]`, `y.a[2..99]`, and `z.a[2..99]`?While I would allow programmer to include directives mandating more precise behavior or allowing less precise behavior, I think the most useful set of behavioral guarantees would allow those elements of `x` and `y` to hold arbitrarily different values, but that `x` and `z` would match. My rationale would be that a programmer who sees `x` and `y` assigned from `temp` would be able to see where `temp` was created, and would be able to see that some parts of it might not have been written. If the programmer cared about ensuring that the parts of `x` and `y` corresponding to the unwritten parts matched, there would be many ways of doing that. If the programmer fails to do any of those things, it's likely because the programmer doesn't care about those values.
The programmer of function `test2()`, however, would generally have no way of knowing whether any part of `x` might hold something that won't behave as some possibly-meaningless number. Further, there's no practical way that the author of `test2` could ensure anything about the parts of `x` corresponding to parts of `temp` that don't be written. Thus, a compiler should not make any assumptions about whether a programmer cares about whether `z.a[2..99]` match `x.a[2..99]`.
A compiler's decision to optimize out assignments to `x[2..99]` and `y[2..99]` may be observable, but if code would not, in fact, care about whether `x[2..99]` and `y[2..99]` match, the fact that the optimization may cause the arrays to hold different Unspecified values should not affect any other aspect of program execution.