Cool. If I understand the LLVM code correctly, it's inserting the following instruction sequence into the code:
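mov r11, [cookie]   ; prologue: load this function's random cookie
xor r11, [rsp]      ; r11 = cookie ^ return address
...                 ; function body
xor r11, [rsp]      ; epilogue: r11 is the cookie again if [rsp] is unchanged
cmp r11, [cookie]
je +2               ; match: skip the two one-byte int3 traps
int3
int3
ret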
(where r11 might be some other suitable temp register as needed). cookie points to an 8-byte chunk of .openbsd.randomdata, a section that is initialized at binary load time by the kernel to contain random data. The canary is one of 4000 possible values, named "__retguard_0" through "__retguard_3999", presumably to avoid having the kernel generate an unbounded amount of random data - the section is limited to 1MB in size.
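The check described above can be modeled in a few lines of Python. This is only a sketch of the arithmetic, not the LLVM pass: the function names, the made-up addresses, and the use of `secrets.randbits` as a stand-in for one kernel-filled `__retguard_N` slot are all assumptions for illustration.

```python
import secrets

MASK64 = (1 << 64) - 1  # model 64-bit registers


def prologue(cookie: int, return_address: int) -> int:
    """Model of: mov r11, [cookie]; xor r11, [rsp]."""
    return (cookie ^ return_address) & MASK64


def epilogue_ok(guard: int, cookie: int, return_address: int) -> bool:
    """Model of: xor r11, [rsp]; cmp r11, [cookie].

    True means the je is taken and ret executes; False means int3 traps.
    """
    return ((guard ^ return_address) & MASK64) == cookie


cookie = secrets.randbits(64)         # hypothetical stand-in for a __retguard_N slot
saved_return = 0x00007F00DEADBEEF     # made-up return address
guard = prologue(cookie, saved_return)

# Return address untouched: the XOR cancels out and the cookie reappears.
intact = epilogue_ok(guard, cookie, saved_return)

# Return address overwritten without fixing up the guard: the compare fails.
smashed = epilogue_ok(guard, cookie, 0x00007F0041414141)

print(intact, smashed)
```

Because XOR is its own inverse, the epilogue recovers the cookie only when the value at [rsp] is the same one seen in the prologue.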
This makes ret instructions fairly hard to use for ROP purposes. Unlike the original design, which XORed [rsp] directly, this new approach preserves return prediction, so it should have a smaller effect on performance. With the changes to reduce polymorphic gadgets in place, this should make ROP attacks significantly less palatable. Also, in the original design, an arbitrary leak made ROP attacks feasible, as you could just place XOR-encrypted return addresses on the stack. With the new design, you need repeatable register control too (assuming the temp register isn't spilled), which raises the bar quite a bit.
It's also worth mentioning there's previous work to reduce the number of polymorphic gadgets in the instruction stream, including a new framework for clang:
The underlying assumption is that the XORed value cannot be crafted by the attacker. If read access to [rsp] is possible, this scheme is vulnerable to an information leak of the return address (or of another program pointer that lets the attacker derive the return address), because cookie = retaddr^[rsp].
Still, this is better than the original RETGUARD, which was easier to attack in both directions: a leak of the stack position gave up the return address, or vice versa.
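The arithmetic behind this concern can be sketched as follows. This is one reading of the comment, under the assumption that the attacker can read the XORed value (for example because the temp register was spilled) and knows or can derive the return address; all names and constants are hypothetical.

```python
MASK64 = (1 << 64) - 1


def recover_cookie(leaked_guard: int, known_return_address: int) -> int:
    """XOR is its own inverse: guard ^ retaddr gives back the cookie."""
    return (leaked_guard ^ known_return_address) & MASK64


def forge_guard(cookie: int, target_address: int) -> int:
    """Guard value that would pass the epilogue check for a chosen target."""
    return (cookie ^ target_address) & MASK64


secret_cookie = 0x1122334455667788   # unknown to the attacker in practice
real_return = 0x401234               # made-up address the attacker has learned
spilled_guard = secret_cookie ^ real_return  # assumed readable by the attacker

recovered = recover_cookie(spilled_guard, real_return)

# With the cookie known, a matching (guard, return address) pair can be forged.
target = 0x405678
forged = forge_guard(recovered, target)
passes_check = (forged ^ target) == secret_cookie

print(hex(recovered), passes_check)
```

The scheme therefore depends on at least one of the two values staying out of the attacker's reach.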
In exploit development parlance, a gadget is a block of assembly instructions executed outside their intended order by an attacker-induced control transfer. A gadget might start in the middle of a basic block, for instance, and be invoked when an attacker uses a memory corruption vulnerability to overwrite a function pointer with an address they control.
"Return oriented programming", which is kind of a dumb name, is the idea of harvesting gadgets from the text of a program and then using them as primitives for a new program. Gadgets are stitched together by the "return" instruction (hence the name ROP). When used by attackers this way, "ret" isn't really "returning" so much as it's being used as an arbitrary indirect jump mechanism.
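A toy model may make the "stitched together by ret" idea concrete. This is a sketch on an invented machine, not real exploit code: the addresses, gadget names, and register file are all made up, and each Python function stands in for a short instruction sequence ending in ret.

```python
def run_chain(text, stack, regs):
    """Execute a ROP-style chain on a toy machine.

    `text` maps addresses to gadgets. Each loop iteration models the `ret`
    at the end of a gadget: pop the next address off the stack and jump.
    """
    stack = list(stack)
    while stack:
        address = stack.pop(0)      # ret: pop next "return address"
        gadget = text.get(address)
        if gadget is None:          # jumped somewhere unmapped: stop
            break
        gadget(regs, stack)
    return regs


def pop_rax(regs, stack):           # models: pop rax; ret
    regs["rax"] = stack.pop(0)


def pop_rbx(regs, stack):           # models: pop rbx; ret
    regs["rbx"] = stack.pop(0)


def add_rax_rbx(regs, stack):       # models: add rax, rbx; ret
    regs["rax"] += regs["rbx"]


# Hypothetical gadget addresses harvested from the program text.
text = {0x1000: pop_rax, 0x2000: pop_rbx, 0x3000: add_rax_rbx}

# Attacker-controlled stack: gadget addresses interleaved with data.
chain = [0x1000, 40, 0x2000, 2, 0x3000]

result = run_chain(text, chain, {"rax": 0, "rbx": 0})
print(result)
```

The attacker never injects code; the "program" is just the sequence of addresses and data on the stack, which is why a check in front of every ret is attractive as a defense.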
This uses OpenBSD's random-data memory [0][1] feature, which was used by the stack protector to provide per-shared-object cookies.
RETGUARD is more than just an improved stack protector; as explained in the commit message, it protects function epilogues that are close to return instructions.
this is basically the xor canary approach originally pioneered by the Stackguard guys (i'm pretty sure you were already around at the time though probably forgot such old history as did the rest of the world apparently ;). the OpenBSD implementation suffers from a few problems, mostly their own making:
1. if they can't find a register to load the cookie into, they'll silently skip instrumentation (i'm not sure how that would happen in practice but the silent treatment when omitting a security feature is a non-starter).
2. if they can find such a register then it'll be spilled to the stack and restored in the epilogue, so a normal buffer overflow can control both the xor'd retaddr and the retaddr itself and the only thing standing in the way of exploitation is the secret cookie value - not unlike with Stackguard/SSP.
3. one would think that a per-function cookie is an improvement but... they're shared among threads (in userland) or everything (in the kernel) so infoleaks are just as catastrophic as before (it'd certainly help if someone described a proper threat model for this defense). at least the kernel side should use a per-syscall cookie to make it somewhat resemble an actual defense mechanism (and there's some more described in my presentation).
4. the int3 stuffing before retn must be someone's joke 'cos it sure as hell won't prevent abusing the retn as a gadget. it does introduce a mispredicted branch for every single function return however.
Anyone seen this new ReturnProtectorPass upstream in clang too? It was written at the end of 2017 AFAIR, which is when we last discussed this here.
CFI still looks more promising to me though. Protecting the CALL part. But this is better than the old gcc/clang stack cookie of course, protecting RET.
It depends on the function. Many things will contribute. If your CPU can keep the cookie in cache, then loading it repeatedly will be relatively fast compared to hitting main memory. If your branch predictor can figure out the jmp over the int3 instructions quickly, then that will also be fast. If the function is very short, then the retguard stuff will add relatively more instructions to the function, so it will have a larger impact than if the function was long, etc.
I found that the runtime overhead was about 2% on average, but there are many factors that contribute.
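The point about short versus long functions can be put as rough arithmetic. This sketch uses executed instruction count as a crude proxy for cost, which ignores the cache and branch-prediction effects mentioned above; the count of five added instructions is read off the sequence quoted at the top of the thread (mov and xor in the prologue; xor, cmp, and the jump in the epilogue) and is an assumption, as are the function sizes.

```python
# Instructions added on the non-trapping path (assumed from the quoted
# sequence): mov + xor in the prologue, xor + cmp + je in the epilogue.
# The two int3 instructions are skipped when the check passes.
ADDED_EXECUTED = 5


def relative_overhead(body_instructions: int) -> float:
    """Added instructions as a fraction of the function's own instructions."""
    return ADDED_EXECUTED / body_instructions


short_function = relative_overhead(10)     # hypothetical 10-instruction leaf
long_function = relative_overhead(1000)    # hypothetical 1000-instruction body

print(f"{short_function:.1%} vs {long_function:.1%}")
```

An average figure such as the roughly 2% reported above would then reflect a mix of function sizes and call frequencies rather than a fixed per-call cost.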
I actually wasn't clear at all. To me, it's more about code browseability, and I've found it much easier to spelunk through codebases for fun in a GitHub-like interface. (Not much to do with the developers' workflows, I suppose.)
That's really where it's coming from, wasn't really trying to be snarky or anything.
It works for them. Changing would be a lot of work for little benefit.
Note that there is a mirror on github at https://github.com/openbsd and to my understanding, developers who prefer git use that, but the official source tree is in CVS.
It would certainly jibe with John Gilmore's story on how the NSA worked through the standards bodies to keep IPSEC easily exploitable by making the design too difficult to implement properly:
Their behavior around Simon & Speck and how they refused to reveal details on how exploitable they could be also seems to be similar to their previous tactics.
This is why it's worrisome that Google intends to implement Speck in Android and has pushed it to the Linux kernel, too.
Don't have all my sources on hand, but the last time I looked into this, the general conclusion I came to was that there's evidence to suggest that someone was in fact paid to put vulnerabilities into the IPSec stack of OpenBSD. But there was no evidence to suggest that those vulnerabilities ever got written or, if they were written, that they ever made it into the source tree.
I believe OpenBSD conducted an audit of their tree when rumours of an IPSec backdoor started and didn't find anything alarming.
The issue has been discussed many times on HN. My guess is that people don't want to revisit it (and I don't know enough off the top of my head to write a good answer). Look at HN history and you can find much of what you need.
nneonneo|7 years ago
Annatar|7 years ago
brynet|7 years ago
https://marc.info/?l=openbsd-cvs&m=151123332419798&w=2
https://marc.info/?l=openbsd-cvs&m=152495643720502&w=2
amenghra|7 years ago
brynet|7 years ago
Github mirror, for easier review: https://github.com/openbsd/src/commit/e688c2b0648a80551cf735...
Klasiaster|7 years ago
kernoble|7 years ago
tptacek|7 years ago
deegles|7 years ago
DeepYogurt|7 years ago
trillic|7 years ago
brynet|7 years ago
[0] https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/lib...
[1] https://www.openbsd.org/innovations.html
brandmeyer|7 years ago
(I started writing a more detailed reply based on the commit description, but there was too much speculation without seeing their source code).
tptacek|7 years ago
https://pax.grsecurity.net/docs/PaXTeam-H2HC15-RAP-RIP-ROP.p...
PaXTeam|7 years ago
ectospheno|7 years ago
Just leaving these here...
https://www.theregister.co.uk/2017/06/26/linus_torvalds_slam...
https://www.theregister.co.uk/2018/01/19/grsecurity_libel_ap...
rurban|7 years ago
keyle|7 years ago
Dumb question (potentially): will this make code that is not inlined, calling a function many times, often, run a lot slower?
Mordak|7 years ago
Annatar|7 years ago
atonse|7 years ago
davisr|7 years ago
brynet|7 years ago
atonse|7 years ago
ams6110|7 years ago
busterarm|7 years ago
w8rbt|7 years ago
brynet|7 years ago
mtgx|7 years ago
https://arstechnica.com/information-technology/2010/12/fbi-a...
https://www.mail-archive.com/[email protected]/msg12...
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
More details on how the NSA has been sabotaging open source projects here:
https://www.youtube.com/watch?v=fwcl17Q0bpk
admax88q|7 years ago
forapurpose|7 years ago