top | item 26804994

Morpheus Turns a CPU into a Rubik’s Cube to Defeat Hackers

60 points | kaisix | 5 years ago | spectrum.ieee.org | reply

30 comments

[+] bawolff|5 years ago|reply
I wish this were more of a technical paper. The interview tries to dumb it down to the point where you can't follow what they actually did, or what types of attacks are actually in scope.

Is this just encrypting pointers, where the encryption key changes very frequently?

[+] kazinator|5 years ago|reply
There is a technical paper; it's just that this isn't it.
[+] teddyh|5 years ago|reply
> What is the overhead for Morpheus?

> Todd Austin: It’s about 10 percent slower on average.

If you accept that kind of overhead, why not just go the Burroughs B5000/Lisp Machine/etc. route and have a hardware instruction set which is essentially byte code for a safe machine, which cannot be tricked into reading and writing in arbitrary places.

Of course, then you would still have the problem of actual security holes in programs which have bugs, which Morpheus effectively hides by obscuring them.

[+] bombcar|5 years ago|reply
A 10% performance hit for "secure mode" doesn't seem all that bad - run at last year's speed for an additional layer of security.

Something like this for IoT devices might actually provide some of the security they're currently missing.

[+] Jiocus|5 years ago|reply
I think your omission of the rest of Todd Austin's answer to this question in the article is close to insincere.

> Todd Austin: It’s about 10 percent slower on average. We could reduce those overheads if we had more time to optimize the system. You know, we’re just a couple of grad students and some faculty. If a company like Intel, AMD, or ARM built this, I would expect that you’d get the overheads down to a few percent.

Edit: Take into account current performance hits for e.g. Spectre, Meltdown patches.

[+] voldacar|5 years ago|reply
I'm a bit confused - how do you do pointer arithmetic if the CPU encrypts pointers for you automatically and doesn't let you know the actual address in memory that they point to?
[+] joshka|5 years ago|reply
I know nothing of the implementation, but hypothetically a compiler could translate:

    p++
to instructions that effectively do:

    p = enc(dec(p)+1)
without exposing the decrypted p outside the CPU. The implementation described seems to add new instructions for incrementing and dereferencing pointers.
[+] kazinator|5 years ago|reply
This is, from what I gather from the paper, because the pointers are not individually encrypted via cipher. Rather, pointers in the same domain (domains being identified via a tagging scheme) are all displaced together by a common value, which comes from that encryption engine. So imagine that some pointers p and q are actually p + c and q + c under the hood. When the machine churns, c will change, and so all pointers have to be updated. There is some sort of moving barrier which indicates which parts of memory are "clean" (ready for use) and which are still using the old key.
[+] dejj|5 years ago|reply
> you change what it means to add a value to a pointer.

Morpheus encrypts pointers. How does it allow C to access an array or a struct while prohibiting out-of-bounds access to, say, a return address sitting somewhere beyond the bounds of said array or struct, I wonder.

[+] k_sze|5 years ago|reply
Maybe it uses homomorphic encryption?
[+] jstanley|5 years ago|reply
Perhaps it could have pointer arithmetic instructions, where "ptr + n" makes code that uses special instructions to decrypt "ptr", add "n", and encrypt the result before giving it back to you?
[+] bansuian|5 years ago|reply
This reminds me of Ghost in the Shell, the anime. There is a concept of a barrier maze that has to be solved before gaining access. There is also a concept of an attack barrier if one tries to force their way into a system.
[+] peter_d_sherman|5 years ago|reply
>"IEEE Spectrum: So you can detect an attack just by looking at what’s being looked at?

Todd Austin: Yeah. Let me give you a classic example—the return stack. [When a program “calls” a subroutine into action, it stores the address the program should return to when the subroutine is over in the return stack.] This is a real popular place for attackers to manipulate to do what’s called a buffer overflow or a code injection. There’s only one part of the program that writes and reads from the return address, and that’s the beginning of a function and the end of a function. That’s really the semantics of the return address. But when you see a program attacking the return address to do code injection, for example, you’ll see writes from many different places in the program. So, these sort of unexpected, undefined accesses, normal programs never do. That’s a tell-tale sign that there is an attack going on. Now, some systems might try to detect that. And when they detect that, then they can kill the program."

PDS: Observation: An x86 emulator, such as Bochs, could be programmed such that it "knew" where any given program/process's stack is -- and explicitly looked for/flagged/logged/warned/alerted and possibly stopped (and dumped for further analysis) -- programs/processes whose stacks were accessed/changed in questionable ways...

Also, this leads to my next thought...

Why couldn't all processes' stacks -- have some form of hardware/CPU protection?

I mean, we have Ring 0 for OS code, and Ring 3 for user program code -- this leaves Rings 1 and 2 free for other purposes... well, why not put program/process stacks there, and guard access to them via the OS?

(Now, a better idea would be a custom CPU designed to know about and explicitly protect each process's stack memory -- but short of that, x86 has those extra rings...)

Yes, it would make programs slower.

Possibly much slower, depending on what's being done. And of course, the program wouldn't have immediate access to its stack variables, without OS intervention...

Maybe part of the solution would be in what Forth did a long time ago -- separate the call/return (control) stack from the local variables stack...

Like, maybe if the call/return stack -- is placed in a Ring that the OS controls (requiring an OS syscall -- which could also log/alert in the case of an anomaly), but the data (local variable) stack is still kept in a common area...

Yes, compilers would have to be modified to support that scheme, and yes, it would be slower... but could it work?

I don't think it would prevent all kinds of attacks (I don't think anything could) -- but it might help mitigate some of the more common stack-based attacks...

Also, the idea of encrypted pointers -- both on the stack and not -- is a good idea.

It would be slow if not implemented in hardware, unless the encryption was a simple XOR with some known-only-at-runtime value (which would be a bit slower than without it) -- but that would be the fastest thing that could implement an approximation of that scheme in software...

Anyway, a lot of interesting ideas in this paper...

[+] qw3rty01|5 years ago|reply
Most of these already exist.

Rings are code privilege levels; they don't "protect" memory - page tables do that. And an implementation to protect the stack has been around for over a decade: DEP.

Call/return stack detection is very new; afaik only current-generation Intel processors have it, under the name CET (Control-flow Enforcement Technology). This works via a processor-level "shadow stack," which is pushed to on calls and popped from on returns, and throws a fault if the actual stack is returning to a different location.

Pointer encryption is available on ARM as PAC (pointer authentication). Afaik there's no equivalent on x86 processors.

Hardware mitigations are gaining more and more traction, so hopefully all of them will be available across all major architectures within the next decade or so.

EDIT: another huge mitigation that's starting to gain traction is process-level virtualization. Qubes is most known for it, but Windows has been building more support for it (currently only Edge and Office support it, under the name "Application Guard"). The idea is that if a malicious actor is able to exploit a program, they would also have to break out of the VM the app is in to gain access to the full system, which can be a much more difficult task.

[+] tandr|5 years ago|reply
Isn't this sort of what OpenBSD(?) does, masking return/entrance pointers with additional bits that only the kernel knows? (my memory is really vague on the details, sorry)