item 16052451

Linux page table isolation is not needed on AMD processors

640 points | fanf2 | 8 years ago | lkml.org | reply

293 comments

[+] twotwotwo|8 years ago|reply
Was the connection with speculative execution already being discussed openly? I know about https://cyber.wtf/2017/07/28/negative-result-reading-kernel-..., but not about anything between that and 28 Dec suggesting someone made it work and that's the reason for KPTI.

If it wasn't in the open, seems...not ideal embargo-wise for AMD to leak it there. Though no one's in that thread complaining about the disclosure, so maybe they either think that part is already known to anyone looking closely, or just don't think it's a very big piece of the exploit puzzle (like, finding the way to get info out a side channel was the hard part).

[+] AnssiH|8 years ago|reply
> Though no one's in that thread complaining about the disclosure, [...]

I imagine if someone had complaints they would make them in private so as to not make the situation even less ideal embargo-wise.

[+] benmmurphy|8 years ago|reply
https://twitter.com/dougallj has released source code (https://t.co/vaaMyajriH) which partially reproduces the problem. You need a little bit of tweaking to read kernel memory and to read the actual values. From his Twitter and from what I've observed, sometimes the speculative code will see 0 and sometimes it will see the correct value. He speculates that it might work if the value is already in the cache.
[+] electic|8 years ago|reply
This is going to have a dramatic effect on the cloud computing market. It might make sense to make sure any VMs you run are on AMD processors; otherwise this can really hurt your performance and effectively cost you more to do the same workload.

It also seems, from early benchmarks, this can slaughter performance with databases.

[+] bhouston|8 years ago|reply
I wonder if cloud providers will ask Intel for partial refunds when their CPUs get 5% to 30% slower than promised?
[+] yeukhon|8 years ago|reply
Why are people insisting this affects the cloud computing market? I am not sure this bug is limited to cloud instances at all.
[+] vasili111|8 years ago|reply
Don't worry. I don't think that there will be two separate kernels for Intel and AMD. I think the performance drop will hit both CPUs, no matter whether they have the bug or not.
[+] anonacct37|8 years ago|reply
This feels like a big FU to Intel. I've heard this patch can slow down programs like du by 50%. Does that mean AMD is going to find itself running twice as fast as competitors?
[+] jandrese|8 years ago|reply
I think the du case was an outlier. Normal workloads shouldn't be so heavily affected. I am expecting a few percent loss on most programs though. It's basically a larger penalty for making a syscall, which was already a fairly slow operation, so performance-minded people avoid them in tight loops. It will be bad for people who need to do lots of fast I/O, I suspect.
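The "avoid syscalls in tight loops" point can be illustrated with a toy Python comparison (function names and chunk sizes here are illustrative, not from any benchmark; the actual per-syscall penalty depends on the KPTI patch and the hardware):

```python
import os
import tempfile

def read_in_tiny_chunks(path, chunk=64):
    """Many small read() syscalls: each one would pay the extra
    KPTI entry/exit cost, so total overhead scales with call count."""
    fd = os.open(path, os.O_RDONLY)
    data, calls = bytearray(), 0
    while True:
        buf = os.read(fd, chunk)
        calls += 1
        if not buf:
            break
        data += buf
    os.close(fd)
    return bytes(data), calls

def read_in_one_call(path, size=1 << 20):
    """One big read(): the per-syscall penalty is paid once."""
    fd = os.open(path, os.O_RDONLY)
    data = os.read(fd, size)
    os.close(fd)
    return data, 1

# Same bytes either way, but wildly different syscall counts.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 4096)
    path = f.name
small = read_in_tiny_chunks(path)
big = read_in_one_call(path)
os.unlink(path)
```

Under a bigger fixed syscall cost, the first version's overhead grows with the file, the second's doesn't, which is why I/O-heavy programs like du show the worst numbers.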
[+] blattimwind|8 years ago|reply
"The overhead was measured to be 0.28% according to KAISER's original authors,[2] but roughly 5% for most workloads by a Linux developer.[1]" [1] = https://lwn.net/Articles/738975/

Though the patches evolved since then. So I guess we'll see.

[+] SolarNet|8 years ago|reply
Yes. AMD didn't take shortcuts and implemented the spec correctly. Intel took shortcuts and introduced bugs, and now the OS has to work around them in software, which is going to be slow. For years Intel has reaped the performance benefits of shortcuts while AMD has been implementing things correctly; now there is a correction.

That's how the market works.

[+] kzrdude|8 years ago|reply
The 50% figure is from a benchmark that didn't run on an Intel CPU!
[+] tankenmate|8 years ago|reply
The text as written only seeks to defend AMD's product. Whether the subtext goes further is open to non-objective speculation. Having said that, I'm sure AMD are feeling pretty happy with their statement. Schadenfreude may be too long a bow...
[+] thinkMOAR|8 years ago|reply
Hmm, I was wondering about the performance hit, and I rather miss performance details in the report, as I consider them significant enough to include.
[+] bitwind|8 years ago|reply
All Intel CPUs are affected, the mitigation increases syscall overhead by 50%, and none of AMD's CPUs are affected? I would say this could be an indicator to short INTC and long AMD...
[+] IgorPartola|8 years ago|reply
Short INTC maybe but I am not sure this means that AMD will increase in value over the long run as a result of this one incident.
[+] 0x00000000|8 years ago|reply
If the hit is as bad as they say (30% performance), cloud providers will be almost forced to upgrade when the new hardware comes out that fixes it. Are they really ready to adopt AMD? Go long on INTC?
[+] rdtsc|8 years ago|reply
> I would say this could be an indicator to short INTC and long AMD...

I would say that too if I were waiting for everyone to sell so I could buy INTC :-)

[+] cjbprime|8 years ago|reply
Are you sure that all Intel CPUs are affected? Might just be older ones.
[+] artellectual|8 years ago|reply
Essentially looks like Intel compromised (whether intentional or not is a different point) the design to get the speed boost that gave them the lead over AMD for the past decade. Will be interesting to see how all this plays out.
[+] jchw|8 years ago|reply
Other than leaking timing information though, is there any reason why this kind of speculative execution can't be secure? Apparently we're going to find out more in the coming weeks, but it feels strongly like Intel has made a number of mistakes leading up to this.
[+] rootlocus|8 years ago|reply
> Essentially looks like Intel compromised (whether intentional or not is a different point)

If it wasn't intentional, then it wasn't a compromise. So it's not a different point.

[+] bhouston|8 years ago|reply
What chip exactly introduced this feature?

Core 2 architecture? Nehalem?

[+] mindcrash|8 years ago|reply
So first they bring a DLC concept ("unlock features by spending money") to their enthusiast platform, and now this?

Having a hunch Threadripper will sell extremely well amongst PC enthusiasts this year...

[+] api|8 years ago|reply
At the meta level this is just a special case of "complexity is evil" in security. CPUs have been getting more and more complex, and the relationship between complexity and bugs (of all types) is exponential. Each new CPU feature exponentially increases the likelihood of errata.

A major underlying cause is that we're doing things in hardware that ought to be done in software. We really need to stop shipping software as native blobs and start shipping it as pseudocode, allowing the OS to manage native execution. This would allow the kernel and OS to do tons and tons of stuff the CPU currently does: process isolation, virtualization, much or perhaps even all address remapping, handling virtual memory, etc. CPUs could just present a flat 64-bit address space and run code in it.

These chips would be faster, simpler, cheaper, and more power efficient. It would also make CPU architectures easier to change. Going from x64 to ARM or RISC-V would be a matter of porting the kernel and core OS only.

Unfortunately nobody's ever really gone there. The major problem with Java and .NET is that they try to do way too much at once and solve too many problems in one layer. They're also too far abstracted from the hardware, imposing an "impedance mismatch" performance penalty. (Though this penalty is minimal for most apps.)

What we need is a binary format with a thin (not overly abstracted) pseudocode that closely models the processor. OSes could lazily compile these binaries and cache them, eliminating JIT program launch overhead except on first launch or code change. If the pseudocode contained rich vectorization instructions, etc., then there would not be much if any performance cost. In fact performance might be better since the lazy AOT compiler could apply CPU model specific optimizations and always use the latest CPU features for all programs.
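As a toy model of that "lazily compile and cache" scheme (all names here are made up for illustration; this is a sketch of the idea, not any real loader):

```python
import hashlib

class LazyAOTCache:
    """Toy model of the scheme above: compile portable pseudocode to
    native code once per (program, CPU model), then reuse the artifact
    so the JIT cost is paid only on first launch or code change."""

    def __init__(self, compile_fn):
        self.compile_fn = compile_fn  # pseudocode bytes -> native artifact
        self.cache = {}
        self.compiles = 0             # counts actual compilations

    def load(self, pseudocode: bytes, cpu_model: str):
        # Keying on a content hash means a changed binary recompiles,
        # and keying on the CPU model allows model-specific optimization.
        key = (hashlib.sha256(pseudocode).hexdigest(), cpu_model)
        if key not in self.cache:
            self.cache[key] = self.compile_fn(pseudocode, cpu_model)
            self.compiles += 1
        return self.cache[key]
```

The second launch of the same program on the same CPU is a pure cache hit; shipping the machine a newer CPU just changes the key and triggers one recompile with the new features enabled.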

Instead we've bloated the processor to keep supporting 1970s operating systems and program delivery paradigms.

It's such an obvious thing I'm really surprised nobody's done it. Maybe there's a perverse hardware platform lock-in incentive at work.

[+] zer00eyz|8 years ago|reply
I have to wonder:

Can Intel release a drop-in CPU that will avoid or mitigate this issue?

The infrastructure investment in Intel cores is huge. If a drop-in replacement lets me minimize downtime, regain performance, and is "cost effective" compared to a cost-prohibitive platform replacement, does this result in Intel having a sales INCREASE as it replaces bad silicon?

I don't know enough about this to speak to it either way, but I would love to hear whether such a fix is possible/viable.

[+] nothrabannosir|8 years ago|reply
> … a sales INCREASE …

Don’t forget to correct for the subtle loss in credibility, and subsequent immeasurably subtle dip in sales, amortised over… forever.

[+] nine_k|8 years ago|reply
I wonder if this can be fixed at firmware level. (I frankly have no idea how deeply configurable Intel cores are.)
[+] ac29|8 years ago|reply
Until more information is available, who knows. It might be fixable in microcode, it might be fixable in a new processor stepping, or it might require a deeper rework that won't come out until the next generation of processors (or even the generation after that).
[+] rbanffy|8 years ago|reply
Wouldn't this kind of issue validate the ideas of microkernel-based OSs, where kernel and user spaces are already completely separated?

BTW, removing the kernel from the non-privileged address space seems like such a great idea (which is not a new one at all) that the whole thing should probably have some hardware support to be made fast.

[+] airesQ|8 years ago|reply
Given Intel's dominance of the server market does this mean that datacenter computational capacity will see an overnight ~5% drop?
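The capacity arithmetic is worth spelling out: a fractional slowdown per instance compounds into a slightly larger fractional increase in instances needed (a back-of-the-envelope sketch; the real overhead is workload-dependent):

```python
import math

def extra_instances(current: int, overhead: float) -> int:
    """Instances needed on top of `current` to keep the same effective
    capacity after each instance loses `overhead` (e.g. 0.05 for ~5%)."""
    effective_per_instance = 1.0 - overhead
    needed = math.ceil(current / effective_per_instance)
    return needed - current

# A fleet of 100 instances losing ~5% each needs ~6 more to break even,
# and ~43 more at the pessimistic 30% figure quoted elsewhere in the thread.
```

So a "~5% drop" is really a ~5.3% increase in instance count (and spend), before any spot-price movement from everyone else doing the same thing.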

Is there enough spare capacity to cope with this? Will spot-instance prices go up? Will I need more instances of a given type to run the same workload?

[+] userbinator|8 years ago|reply
All that I've read about this so far seems to indicate that it's only a way to bypass KASLR... which is itself not really a problem, but there must be something more to it. Given that it doesn't affect AMD, perhaps it's related to Intel ME?
[+] zippie|8 years ago|reply
Data structures stored in kernel space, such as llds [1], will not incur the overhead of the TLB flush/load.

I suspect that storing data in the kernel space in order to avoid maintaining a large application PD will become the norm, whereas in the past it has been reserved for use cases like search engines with massive in-memory trees.

[1] https://github.com/johnj/llds

[+] czeidler|8 years ago|reply
Would it be possible to slow down segfault notifications to mitigate the attack? For example, if the segfault was not in kernel space, halt the application for the time offset of a kernel read. In this way all segfaults would be reported at more or less the same time and the attack could be avoided.

Are there any sane apps that depend on timely segfault handling and thus might be affected by such a workaround?
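The padding idea can be sketched as a user-space decorator that stretches a handler to a fixed minimum latency (a toy model only; kernels deliver SIGSEGV quite differently, and the decorator name is made up):

```python
import time

def constant_time(min_seconds: float):
    """Pad a handler so it always takes at least `min_seconds`,
    hiding how long the real work inside actually took."""
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                if elapsed < min_seconds:
                    time.sleep(min_seconds - elapsed)
        return inner
    return wrap
```

The catch with this class of defense is that it only masks the timing of the padded event itself, not timing signals the attacker measures elsewhere.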

[+] caf|8 years ago|reply
It's not timing the segfault delivery itself, the idea is to time another read of your own address space after the fault to see if it's been prefetched or not.

Maybe you could CLFLUSH on segfault delivery though.

[+] jopsen|8 years ago|reply
I sometimes wonder if verifying properties of the code we run wouldn't be smarter than relying on hardware isolation. Or at least in addition to hardware isolation, so that there are two layers.

By verifying I'm thinking of NativeClient-like or JVM isolation.

Obviously, it would entail a complete OS rewrite, or maybe a partial one...

[+] sandworm101|8 years ago|reply
Lol. I was already very happy with my Ryzen 1800 bought a couple of months ago. Even happier today.
[+] rdudek|8 years ago|reply
How does all this affect everyday regular users?
[+] rbanffy|8 years ago|reply
Would it make sense to switch cores at the same time the context is switched between user and kernel? The cache penalty is already there, and if one could go back and forth between already-primed caches on different cores, at least some of the performance loss would be mitigated.