This has been a sore point in a lot of discussions about compiler optimizations and cryptographic code: the accusation that compilers and compiler engineers are sabotaging cryptographers' efforts to keep their code free of side channels. The issue has never been the compiler; it has always been the language. There was never a way to express the right intention from within C (or most other languages, really).
This primitive we're trying to introduce is meant to make up for this shortcoming without having to introduce additional rules in the standard.
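The shortcoming can be made concrete with the classic branchless-select idiom cryptographers write by hand. This is a minimal sketch (function names are mine, not from any proposal): the C semantics only promise the resulting *value*, so an optimizer that recognizes the pattern is free to lower it back into a conditional branch, silently reintroducing the timing side channel.

```c
#include <stdint.h>

/* Turn a secret 0/1 bit into an all-zeros/all-ones mask. */
static uint32_t bit_to_mask(uint32_t bit) {
    return (uint32_t)0 - (bit & 1u);
}

/* Mask-and-blend select: picks a when mask is all-ones, b when it is
 * all-zeros. Nothing in the C standard forbids a compiler from
 * rewriting this as "mask ? a : b" with a branch, which is exactly
 * the behaviour constant-time code cannot tolerate. */
static uint32_t masked_select(uint32_t mask, uint32_t a, uint32_t b) {
    return (mask & a) | (~mask & b);
}
```

A dedicated builtin would let the programmer state "this selection must stay branch-free" instead of hoping the optimizer leaves the idiom alone.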
There really ought to be a subset of C that lets you write portable assembly: one where only a defined set of optimisations is allowed and required to be performed, "inline" means always inline, the "register" and "auto" keywords have their original meanings, every stack variable is allocated unless otherwise indicated, every expression has a defined evaluation order, every read/write from/to an address is carried out, nothing is ever reordered, and undefined behaviour becomes machine-specific behaviour. Currently, if you need that level of control, your only option is writing assembly, which gets painful when you need to support multiple architectures or want fancy features like autocomplete, structs, and functions.
>how compilers and compiler engineers are sabotaging the efforts of cryptographers
I'm not exposed to this space very often, so maybe you or someone else could give me some context. "Sabotage" is a deliberate effort to ruin/hinder something. Are compiler engineers deliberately hindering the efforts of cryptographers? If yes... is there a reason why? Some long-running feud or something?
Or, through the course of their efforts to make compilers faster/etc, are cryptographers just getting the "short end of the stick" so to speak? Perhaps forgotten about because the number of cryptographers is dwarfed by the number of non-cryptographers? (Or any other explanation that I'm unaware of?)
Last I saw, it seemed like the plan was to unconditionally enable it, and on the off chance there's ever a piece of hardware where it's a substantial performance win, offer a way to opt out of it.
Sorry, I may be missing the point here, but reading that page doesn’t immediately make it obvious to me what that feature is. Is it some constant time execution mechanism that you can enable / disable on a per-thread basis to do… what exactly?
These are meaningless without guarantees that the processor will run the instructions in constant time and not run the code as fast as possible. Claims like cmov on x86 always being constant time are dangerous because a microcode update could change that to not be the case anymore. Programmers want an actual guarantee that the code will take the same amount of time.
We should be asking our CPU vendors to support enabling a constant time mode of some sort for sensitive operations.
That's been one of my counters to the complaint that C isn't safe: the underlying architecture isn't safe either.
That said, WG21 and WG14 don't seem to be able to get the memo that safety is more important than single-core speed. Or, as I suspect, a bunch of members are actually malicious.
I agree. For use cases where side channel attacks are likely to be attempted, the security of the system ultimately depends on both the software and hardware used.
So this makes me curious: is there a reason we don't do something like a __builtin_ct_begin()/__builtin_ct_end() pair of intrinsics, where the begin intrinsic opens a constant-time code region, all code within that region must be constant-time, and the region must be closed with an end() call? I'm not too familiar with compiler intrinsics or how these things work, so I thought I'd ask. The intrinsic could be scoped such that the compiler can use its implementation-defined freedom to enforce the begin/end pairs. But I don't know, maybe this isn't feasible?
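For what it's worth, usage of that hypothetical pair might look like the sketch below. No compiler implements such intrinsics; they are stubbed as no-op macros here purely so the example compiles, and their names are stand-ins for the proposed __builtin_ct_begin/__builtin_ct_end.

```c
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL: stand-ins for the proposed region markers, stubbed as
 * no-ops so this compiles. A real implementation would have the
 * optimizer reject (and refuse to introduce) any secret-dependent
 * branch or memory access between the two markers. */
#define ct_region_begin() ((void)0)
#define ct_region_end()   ((void)0)

/* Compare two authentication tags without an early exit. */
int tag_equal(const uint8_t *a, const uint8_t *b, size_t n) {
    uint8_t diff = 0;
    ct_region_begin();
    for (size_t i = 0; i < n; i++)
        diff |= (uint8_t)(a[i] ^ b[i]);  /* fold every byte, no branch */
    ct_region_end();
    return diff == 0;                    /* 1 iff the tags match */
}
```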
It'd be very hard for the compiler to enforce constant-time execution for generic code. As an example, if you wrote the naive password check where the first byte that doesn't match returns false, is that a compiler error if it can't transform it into a constant-time version?
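The two shapes of that comparison can be sketched as follows; whether a compiler could ever be required to turn the first into the second is exactly the open question here. (Nothing in this sketch is guaranteed constant-time at the ISA level either.)

```c
#include <stddef.h>
#include <stdint.h>

/* Early-exit compare: run time depends on the position of the first
 * mismatch, which is exactly the leak a password check must avoid. */
int naive_eq(const uint8_t *a, const uint8_t *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* The rewrite a compiler would have to synthesize: touch every byte,
 * accumulate the differences, and decide only once at the end. */
int ct_eq(const uint8_t *a, const uint8_t *b, size_t n) {
    uint8_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc |= (uint8_t)(a[i] ^ b[i]);
    return acc == 0;
}
```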
I think __builtin_ct_select and __builtin_ct_expr would be good ideas. (They could also be implemented in GCC in future, as well as LLVM.)
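Assuming a (cond, true_value, false_value) signature for __builtin_ct_select — treat the exact name and argument order as unverified, since shipping compilers may not have it yet — usage could be wrapped with a mask-idiom fallback like this:

```c
#include <stdint.h>

#ifndef __has_builtin
#define __has_builtin(x) 0   /* older compilers: assume absent */
#endif

/* Select between a and b on a secret condition. The builtin branch of
 * the #if is speculative: the name and signature come from the
 * proposal under discussion. The fallback is the classic mask idiom,
 * which carries no guarantee against re-branching by the optimizer. */
static uint32_t ct_select_u32(int cond, uint32_t a, uint32_t b) {
#if __has_builtin(__builtin_ct_select)
    return __builtin_ct_select(cond, a, b);
#else
    uint32_t mask = (uint32_t)0 - (uint32_t)!!cond; /* all-ones or zero */
    return (a & mask) | (b & ~mask);
#endif
}
```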
In some cases it might be necessary to consider the possibility of invalid memory accesses (and avoid the side-channels when doing so). (The example given in the article works around this issue, but I don't know if there are any situations where this will not help.)
The side channel from memory-access timings is exactly why CMOV is its own instruction on x86-64: it retrieves the memory operand regardless of the condition value. Anything else would change the timing based on the condition. If you're going to segfault, that's going to be visible to an attacker regardless, because the program is going to hang up.
I'd love if this could make it into Rust... but I'm wondering if it'd be a bit of a burden on the creators of alternative backends (e.g. Cranelift), since if they implemented it naively, they would be unsuitable for compiling cryptographic code using it.
> but I'm wondering if it'd be a bit of a burden on the creators of alternative backends (e.g. Cranelift)
Technically any new feature that requires backend support is an additional burden on backend devs. There's nothing special about constant-time builtins in this respect.
> since if they implemented it naively
Strictly speaking, whether an implementation is naive is independent of whether it is correct. An implementation that purports to be constant time while not actually being constant time is wrong, no matter how naive or sophisticated the implementation may be.
Disabling optimizations does not necessarily result in more deterministic execution.
With "-O0", the generated code normally retains a huge number of useless register loads and stores, which lead to non-deterministic timing due to contention in the use of the caches and of the main memory interface. Optimized code may run entirely in registers and is thus executed in constant time regardless of what the other CPU cores do.
The only good part is that this non-deterministic timing will not normally depend on the data values. The main danger of the non-constant execution time is when this time depends on the values of the processed data, which provides information about those values.
There are cases when disabling optimization may cause data-dependent timing, e.g. if with optimization the compiler would have chosen a conditional move and without optimization it chooses a data-dependent branch.
The only certain way of achieving data-independent timing is to use either assembly language or appropriate compiler intrinsics.
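On x86-64 with GCC or Clang, the inline-assembly route looks like the sketch below: the asm block forces an actual CMOV, so the optimizer cannot rewrite the selection as a branch. The non-x86-64 fallback is the mask idiom, which compiles everywhere but carries no hardware-level guarantee.

```c
#include <stdint.h>

/* Pick a when cond != 0, else b, without a secret-dependent branch. */
static uint64_t ct_pick(uint64_t cond, uint64_t a, uint64_t b) {
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t result = b;
    __asm__("test %[c], %[c]\n\t"   /* set ZF from cond               */
            "cmovnz %[a], %[r]"     /* result = a iff cond != 0       */
            : [r] "+r"(result)
            : [c] "r"(cond), [a] "r"(a)
            : "cc");
    return result;
#else
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    return (a & mask) | (b & ~mask);
#endif
}
```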
fooker|3 months ago
Any side effect is a side channel. There are always going to be side channels in real code running on real hardware.
Sure, you can change your code, compiler, or even hardware to account for this, but at its core that is security by obscurity.
amluto|3 months ago
https://www.intel.com/content/www/us/en/developer/articles/t...
Sure, you could run on some hypothetical OS that supports DOITM and insert syscalls around every manipulation of secret data. Yeah, right.
adrian_b|3 months ago
For an example of a list of such instructions see:
https://www.intel.com/content/www/us/en/developer/articles/t...
However, cooperation from the operating system is necessary, as the constant-time execution mode may need to be enabled by setting certain CPU-control bits in protected registers (e.g. IA32_UARCH_MISC_CTL[DOITM]).
See for instance:
https://www.intel.com/content/www/us/en/developer/articles/t...
CMOV is on the list of instructions with constant-time execution, but the list is valid only with the corresponding control bit set correctly.
westurner|3 months ago
"Constant-Time Coding Support in LLVM: Protecting Cryptographic Code at the Compiler Level" (2025-10) PDF: https://llvm.org/devmtg/2025-10/slides/quick_talks/alexandre... :
> Circumvent Branch-base Timing Attacks