
Retguard: An improved stack protector for OpenBSD

187 points | brynet | 7 years ago | marc.info

69 comments

[+] nneonneo|7 years ago|reply
Cool. If I understand the LLVM code correctly, it's inserting the following instruction sequence into the code:

    mov r11, [cookie]
    xor r11, [rsp]
    ...
    xor r11, [rsp]
    cmp r11, [cookie]
    je 1f
    int 3
    int 3
1:  ret
(where r11 might be some other suitable temp register as needed). cookie points to an 8-byte chunk of .openbsd.randomdata, a section that is initialized at binary load time by the kernel to contain random data. The canary is one of 4000 possible values, named "__retguard_0" through "__retguard_3999", presumably to avoid having the kernel generate an unbounded amount of random data - the section is limited to 1MB in size.

This makes ret instructions fairly hard to use for ROP purposes. Unlike the original design, which XOR'd [rsp] directly, this new approach preserves return prediction, so it should have a smaller effect on performance. With the changes to reduce polymorphic gadgets in place, this should make ROP attacks significantly less palatable. Also, in the original design, an arbitrary leak made ROP attacks feasible, as you could just place XOR-encrypted return addresses on the stack. With the new design you need repeatable register control too, assuming the temp register isn't spilled, which raises the bar quite a bit.
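A toy model of the scheme described above (illustrative only; the names `prologue`/`epilogue` are mine, and the real instrumentation works on registers and the live stack, not Python values):

```python
# Toy model of the per-function XOR cookie check (not OpenBSD's code).
# The cookie stands in for one __retguard_N slot in .openbsd.randomdata;
# "return_address" stands in for [rsp].
import secrets

COOKIE = secrets.randbits(64)  # filled with random data at load time

def prologue(return_address):
    # mov r11, [cookie]; xor r11, [rsp]
    return COOKIE ^ return_address  # value kept in the temp register

def epilogue(r11, return_address):
    # xor r11, [rsp]; cmp r11, [cookie]; je over the int3s
    if (r11 ^ return_address) != COOKIE:
        raise RuntimeError("int3: return address corrupted")
    return return_address  # ret

ret_addr = 0x00007F0012345678
r11 = prologue(ret_addr)
assert epilogue(r11, ret_addr) == ret_addr   # clean return passes
try:
    epilogue(r11, ret_addr ^ 0x41)           # overwritten [rsp] is caught
except RuntimeError:
    print("corruption detected")
```

The point of the XOR round-trip is that an attacker who overwrites [rsp] must also produce a matching temp-register value, which requires knowing the cookie.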

[+] Annatar|7 years ago|reply
What is preventing an attacker from overwriting the last xor and cmp instruction sequence with nop instructions?
[+] Klasiaster|7 years ago|reply
The underlying assumption is that the XOR'd value cannot be crafted by the attacker. If read access to [rsp] is possible, this scheme is vulnerable to an information leak of the return address (or another program pointer that lets the attacker derive the return address), because cookie = retaddr ^ [rsp]. But this is still better than the original RETGUARD, which was easier to attack both ways: either a leak of the stack position to get the return address, or vice versa.
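The algebra is easy to check: if both the XOR'd slot and the return address leak, the cookie falls out immediately (a sketch with made-up values):

```python
# slot = cookie ^ retaddr  =>  cookie = slot ^ retaddr
cookie  = 0x0123456789ABCDEF     # secret, from .openbsd.randomdata
retaddr = 0x00007F00DEADBEEF     # leaked return address
slot    = cookie ^ retaddr       # leaked XOR'd value

recovered = slot ^ retaddr
assert recovered == cookie

# With the cookie known, a forged return address can be re-encrypted
# so that it passes the epilogue check:
forged = 0x00007F00CAFEBABE
assert (recovered ^ forged) ^ forged == recovered
```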
[+] kernoble|7 years ago|reply
What's meant by the term "gadget"? Does it mean an exploit mechanism? My searches are mostly showing information about USB device drivers.
[+] tptacek|7 years ago|reply
In exploit development parlance, a gadget is a block of assembly instructions executed outside their intended order by an attacker-induced control transfer. A gadget might start in the middle of a basic block, for instance, and be invoked when an attacker uses a memory corruption vulnerability to overwrite a function pointer with an address they control.

"Return oriented programming", which is kind of a dumb name, is the idea of harvesting gadgets from the text of a program and then using them as primitives for a new program. Gadgets are stitched together by the "return" instruction (hence the name ROP). When used by attackers this way, "ret" isn't really "returning" so much as it's being used as an arbitrary indirect jump mechanism.

[+] DeepYogurt|7 years ago|reply
Question: does this protection get applied to all languages that use LLVM on OpenBSD, or is it C/C++ specific?
[+] trillic|7 years ago|reply
How is this different than any other stack canary? Is it because it is per-function instead of per-stack like Linux?
[+] brandmeyer|7 years ago|reply
First off, the normal glibc/GCC stack canary is per-process, not per-stack.

(I started writing a more detailed reply based on the commit description, but there was too much speculation without seeing their source code).

[+] tptacek|7 years ago|reply
I don't pay enough attention to know how this is different from what PaX team did with RAP 3 years ago:

https://pax.grsecurity.net/docs/PaXTeam-H2HC15-RAP-RIP-ROP.p...

[+] PaXTeam|7 years ago|reply
this is basically the xor canary approach originally pioneered by the Stackguard guys (i'm pretty sure you were already around at the time though probably forgot such old history as did the rest of the world apparently ;). the OpenBSD implementation suffers from a few problems, mostly their own making:

1. if they can't find a register to load the cookie into, they'll silently skip instrumentation (i'm not sure how that would happen in practice but the silent treatment when omitting a security feature is a non-starter).

2. if they can find such a register then it'll be spilled to the stack and restored in the epilogue, so a normal buffer overflow can control both the xor'd retaddr and the retaddr itself and the only thing standing in the way of exploitation is the secret cookie value - not unlike with Stackguard/SSP.

3. one would think that a per-function cookie is an improvement but... they're shared among threads (in userland) or everything (in the kernel) so infoleaks are just as catastrophic as before (it'd certainly help if someone described a proper threat model for this defense). at least the kernel side should use a per-syscall cookie to make it somewhat resemble an actual defense mechanism (and there's some more described in my presentation).

4. the int3 stuffing before retn must be someone's joke 'cos it sure as hell won't prevent abusing the retn as a gadget. it does introduce a mispredicted branch for every single function return however.

[+] rurban|7 years ago|reply
Anyone seen this new ReturnProtectorPass also upstream at clang? It was written at the end of 2017 AFAIR; that was when we last discussed this here.

CFI still looks more promising to me though, since it protects the CALL side. But this is better than the old gcc/clang stack cookie of course, since it protects RET.

[+] keyle|7 years ago|reply
When I first read this, I thought how come no one thought of this before? It feels like common sense.

Dumb question (potentially): will this make code that calls a non-inlined function many times run a lot slower?

[+] Mordak|7 years ago|reply
It depends on the function. Many things will contribute. If your CPU can keep the cookie in cache, then loading it repeatedly will be relatively fast compared to hitting main memory. If your branch predictor can quickly figure out the jump over the int3 instructions, that will also be fast. If the function is very short, the retguard instrumentation adds relatively more instructions, so it will have a larger impact than on a long function, etc.

I found that the runtime overhead was about 2% on average, but there are many factors that contribute.

[+] Annatar|7 years ago|reply
Yes. Any extra instructions will slow things down. The number of times the code runs will be the amplifying factor.
[+] atonse|7 years ago|reply
While I didn't understand most of this, any reason why they're still using CVS? Genuinely curious, I'm sure they have good reasons.
[+] davisr|7 years ago|reply
The developer makes the code, not the tools. Some people simply prefer different tools than others.
[+] brynet|7 years ago|reply
You must be jealous that we're self-hosted. ;-)
[+] atonse|7 years ago|reply
I actually wasn't clear at all. To me, it's more about code browsability, and I've found it much easier to spelunk through codebases for fun in a GitHub-like interface. (Not much to do with the developers' workflows, I suppose.)

That's really where it's coming from, wasn't really trying to be snarky or anything.

[+] ams6110|7 years ago|reply
It works for them. Changing would be a lot of work for little benefit.

Note that there is a mirror on github at https://github.com/openbsd and to my understanding, developers who prefer git use that, but the official source tree is in CVS.

[+] busterarm|7 years ago|reply
They've managed to maintain a consistent roughly-6 month release cadence since 1995. CVS is working for them, so why switch?
[+] w8rbt|7 years ago|reply
It is simple, they know it well and it works. Just like C89.
[+] unknown|7 years ago|reply

[deleted]

[+] brynet|7 years ago|reply
This is not a dupe! That is a previous incarnation of RETGUARD that wasn't committed. The implementations are very different, please undo.
[+] mtgx|7 years ago|reply
Speaking of OpenBSD, does anyone know what happened here and how true it is?

https://arstechnica.com/information-technology/2010/12/fbi-a...

It would certainly jibe with John Gilmore's story on how the NSA worked through the standards bodies to keep IPSEC easily exploitable by making the design too difficult to implement properly:

https://www.mail-archive.com/[email protected]/msg12...

Their behavior around Simon & Speck and how they refused to reveal details on how exploitable they could be also seems to be similar to their previous tactics.

This is why it's worrisome that Google intends to implement Speck in Android and has pushed it to the Linux kernel, too.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

More details on how the NSA has been sabotaging open source projects here:

https://www.youtube.com/watch?v=fwcl17Q0bpk

[+] admax88q|7 years ago|reply
I don't have all my sources on hand, but the last time I looked into this, the general conclusion I came to was that there's evidence to suggest that someone was in fact paid to put vulnerabilities into the IPSec stack of OpenBSD. But there was no evidence that those vulnerabilities were ever written, or, if they were written, that they ever made it into the source tree.

I believe OpenBSD conducted an audit of their tree when rumours of an IPSec backdoor started and didn't find anything alarming.

[+] forapurpose|7 years ago|reply
The issue has been discussed many times on HN. My guess is that people don't want to revisit it (and I don't know enough off the top of my head to write a good answer). Look at HN history and you can find much of what you need.