top | item 17767060

The Jury Is In: Monolithic OS Design Is Flawed [pdf]

172 points | ingve | 7 years ago | ts.data61.csiro.au | reply

194 comments

[+] pweissbrod|7 years ago|reply
I wonder if there exists a parallel dimension where Linux is a microkernel design and folks are pushing for monolithic, citing driver friendliness and performance.
[+] andrewstuart2|7 years ago|reply
If parallel dimensions exist, then it's most certainly one of the closest dimensions to ours.

If I'm sure of one thing, it's that as soon as we decided to build everything as microkernels, we'd have the same squeaky wheels touting the massive benefits of monolithic OS design. We're hilariously cyclical in our preferences.

"Think of all the runtime efficiencies of the shared memory, and how much easier it would be to develop on the shared codebase!"

[+] lainga|7 years ago|reply
the Hurd-verse lives! Richard Stallman is a clean-shaven, foulmouthed autocrat, and Linus is composing folk songs about joining hands with Intel and Nvidia!
[+] SkyMarshal|7 years ago|reply
It would be the dimension where Andrew Tanenbaum licensed Minix3 under a Free license back before Linus hacked up his own monolithic kernel version.
[+] pjmlp|7 years ago|reply
Lack of performance on microkernels is a myth nowadays.

QNX and many embedded OSes, some of which drive high-integrity software, are microkernel-based.

Including the one most likely handling the real-time communication of this mobile radio.

[+] Sporktacular|7 years ago|reply
And in that dimension there would be debate about whether we should cut corners by reducing security and reliability to slightly improve speed and mollify lazy programmers.

That debate would be short and the answer would be "nope, we shouldn't because that would be stupid".

[+] snaky|7 years ago|reply
https://yarchive.net/comp/microkernels.html

edit added

> Guys, there is a _reason_ why microkernels suck. This is an example of how things are _not_ "independent". The filesystems depend on the VM, and the VM depends on the filesystem. You can't just split them up as if they were two separate things (or rather: you _can_ split them up, but they still very much need to know about each other in very intimate ways).

https://yarchive.net/comp/linux/user_space_filesystems.html

[+] MisterTea|7 years ago|reply
You mean a parallel universe where GNU/HURD was actually finished.
[+] nickpsecurity|7 years ago|reply
I don't know, haha. I do know most microkernels in the commercial space support POSIX or Linux in user mode. OKL4 and L4Linux also supported using a minimal version of Linux just to get its device drivers; a native app or other VM then uses them via virtual drivers.
[+] JdeBP|7 years ago|reply
This would be the world of Windows NT in between version 3.5 and 4.0.
[+] jacquesm|7 years ago|reply
The jury was already in by the mid-1990s, but Linus Torvalds doesn't know when he's wrong or when to listen to his betters. Linux succeeded because of its community, not because of its architecture. QNX has shown the strength of microkernels for decades: they are far more stable and much easier to work on than monoliths. The (small) speed penalty should be well worth the price of admission.
[+] enitihas|7 years ago|reply
Reminds me of the famous Torvalds Tanenbaum debate. https://groups.google.com/forum/m/#!topic/comp.os.minix/wlhw...
[+] CGamesPlay|7 years ago|reply
Ah, this was great!

> Linus "my first, and hopefully last flamefest" Torvalds

[+] travbrack|7 years ago|reply
>I also agree that linux takes the non-portability to an extreme: I got my 386 last January, and linux was partly a project to teach me about it.

Times sure have changed

[+] thsowers|7 years ago|reply
Thanks so much for the share; there are great gems in here. I still found myself surprised at this:

> True, linux is monolithic, and I agree that microkernels are nicer.

It's hard to believe that he was only 23 when he wrote this

[+] acd|7 years ago|reply
There is a reason why kernel code runs in privileged mode: speed! If you run more kernel code in privileged mode, you do not need to copy as much data between the kernel and user space. With a microkernel you have to copy more data up to user space, and that copying comes with context switches and costs performance.

Larger monolithic kernels: speed.

Microkernels have advantages such as a smaller privileged attack surface (and thus better security), and they are more crash-proof, since you can restart userland processes such as device drivers.

https://en.wikipedia.org/wiki/Microkernel
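The cost being debated here can be made concrete with a toy measurement. This is an illustrative sketch, not a rigorous benchmark: it times a cheap syscall (each call crosses the user/kernel boundary) against a pure user-space operation. Absolute numbers vary by machine; only the large ratio is the point.

```python
# Sketch: the expense is the user/kernel transition itself, not the code
# happening to live in the kernel. Compare a 1-byte pread() syscall with
# a 1-byte slice of an in-memory buffer.
import os
import tempfile
import timeit

fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 4096)
buf = b"x" * 4096

syscall_t = timeit.timeit(lambda: os.pread(fd, 1, 0), number=100_000)
user_t = timeit.timeit(lambda: buf[0:1], number=100_000)
print(f"100k syscalls: {syscall_t:.3f}s, 100k user-space reads: {user_t:.3f}s")

os.close(fd)
os.unlink(path)
```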

[+] naasking|7 years ago|reply
> If you run more kernel code in privileged mode then you do not need to copy as much data between the kernel and user space.

Microkernels don't copy data into the kernel address space; they copy data between userland address spaces, which still happens in monolithic systems anyway when you're doing IPC. These are typically short messages, often just data passed in registers.

And if copying is going to be a bottleneck, then you negotiate a shared address space just like in Unix, and no more copying.
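The "negotiate a shared address space" point can be sketched in a few lines. This is an illustrative analogue using Python's multiprocessing.shared_memory, not how any particular microkernel implements it; the region size and message are arbitrary.

```python
# Sketch: two processes exchange data through one shared mapping instead
# of copying it through a pipe or message.
from multiprocessing import Process, shared_memory

def producer(name: str) -> None:
    shm = shared_memory.SharedMemory(name=name)
    shm.buf[:5] = b"hello"   # write straight into the shared mapping
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=4096)
    p = Process(target=producer, args=(shm.name,))
    p.start()
    p.join()
    print(bytes(shm.buf[:5]))  # b'hello' -- read without a second copy
    shm.close()
    shm.unlink()
```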

[+] Sporktacular|7 years ago|reply
In a world of constant security threats and 2 GHz CPUs dedicated to cat videos, needing speed can no longer be the excuse for poor design.

And "embedded CPUs need every precious cycle" is not an argument either. As the paper says, modern microkernels have a negligible speed penalty, while IoT devices and networked industrial controllers are a security backwater.

[+] pradn|7 years ago|reply
Do you really need to do a full copy? What if you had a shared page and notified the user process when the data in the page was available at a certain offset?

Take a look at this paper (FlexSC: Flexible System Call Scheduling with Exception-Less System Calls):

https://www.usenix.org/legacy/event/osdi10/tech/full_papers/...
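The shared-page-plus-notification idea above can be sketched like this. Multiprocessing primitives stand in for whatever doorbell mechanism a real kernel would use, and all the names and the hard-coded slot offset are illustrative.

```python
# Sketch: the writer fills a slot in a shared page, then publishes the
# offset and signals the reader, instead of copying the payload through
# a syscall.
from multiprocessing import Event, Process, Value, shared_memory

def writer(name: str, ready, offset) -> None:
    shm = shared_memory.SharedMemory(name=name)
    msg = b"request"
    off = 64                          # pretend an allocator picked this slot
    shm.buf[off:off + len(msg)] = msg
    offset.value = off                # tell the reader where to look
    ready.set()                       # doorbell: notify, don't copy
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=4096)
    ready, offset = Event(), Value("i", 0)
    p = Process(target=writer, args=(shm.name, ready, offset))
    p.start()
    ready.wait()
    print(bytes(shm.buf[offset.value:offset.value + 7]))  # b'request'
    p.join()
    shm.close()
    shm.unlink()
```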

[+] hinkley|7 years ago|reply
Code doesn't run faster because it's in the kernel. The speed you're talking about comes from avoiding transitions in and out of a given space. If you stay out or stay in the results are pretty similar.

Except for your tooling. Cloudflare wrote an article a couple of years ago on why they don't use a user-space network stack: https://blog.cloudflare.com/why-we-use-the-linux-kernels-tcp... and the tl;dr is a profound lack of feature parity. They use everything from iptables to tcpdump. If someone else worked on feature parity (they say it's too expensive for too small a gain for them), I expect they'd change their tune.

[+] imglorp|7 years ago|reply
The title is missing "from a security standpoint". Of course, everything is a tradeoff. TLDR:

> We have presented what is, to the best of our knowledge, the first quantitative empirical assessment of the security implications of operating system structure, i.e. monolithic vs microkernel-based design.

> Our results provide very strong evidence that operating-system structure has a strong effect on security. 96% of critical Linux exploits would not reach critical severity in a microkernel-based system, 57% would be reduced to low severity, the majority of which would be eliminated altogether if the system was based on a verified microkernel. Even without verification, a microkernel-based design alone would completely prevent 29% of exploits.

> Given the limited number of documented exploits, we have to assume our results to have a statistical uncertainty of about nine percentage points. Taking this into account, the results remain strong. The conclusion is inevitable:

> From the security point of view, the monolithic OS design is flawed and a root cause of the majority of compromises. It is time for the world to move to an OS structure appropriate for 21st-century security requirements.

[+] gjm11|7 years ago|reply
So, they've looked at a sample of exploits that were critical on Linux and established that most wouldn't have been critical on a hypothetical otherwise-similar microkernel system.

But they haven't looked at a sample of exploits that were critical on an actual microkernel OS and seen how many would have been less serious (or not arisen) on a hypothetical otherwise-similar monolithic-kernel system.

It reminds me of a nice observation in "Surely you're joking, Mr Feynman". Feynman developed some nonstandard ways of solving mathematical problems. Other people came to him and he repeatedly solved problems they'd been stuck on. "He must be much smarter than us!" But the problems they brought to him were selected as ones they couldn't do, so of course he'd look better than them on those. They never bothered asking him problems he couldn't do but they could, because they'd already done them.

Now, maybe the authors of the paper are confident that there's no way that a microkernel design could encourage or exacerbate vulnerabilities. But so far as I can see they don't offer any actual argument for that proposition.

[+] da_chicken|7 years ago|reply
> The title is missing "from a security standpoint".

I mean, kind of, but since maintaining system security and integrity is a core function of the OS -- in fact, it is the primary gatekeeper in terms of all system security -- it means that "being secure" and "being correct" are often synonymous terms for an operating system.

After all, if we don't care about security at all we can all run CP/M or run everything as root.

Now, sure, you can say that the whole thing is bullshit because verified microkernels are so difficult to design that the end result would be an unusable system, but all that suggests is that when you design your kernel you should aim for a hybrid and compromise more on the side of a microkernel where you can.

[+] digi_owl|7 years ago|reply
The age old joke about the computer encased in concrete at the bottom of the ocean does indeed come to mind.
[+] fulafel|7 years ago|reply
It would be more credible if the authors were able to distinguish between exploit and vulnerability.
[+] Steko|7 years ago|reply
I examined all fatal car crashes in the United States between pi day and Bloomsday in 2015 and assigned them a Mitigation Score based on the hypothetical that the people involved were instead walking. 98.3% of fatalities would have been prevented. The jury is in: ban all horseless carriages.
[+] nasoieu|7 years ago|reply
That chart with the growth of the Linux kernel discredits everything. The Linux kernel continues to grow because they are obsessed with keeping all drivers in mainline instead of having a stable API for them as any sane project would.
[+] rhencke|7 years ago|reply
I believe one of the reasons Linux continues to be successful today is _because_ they keep the drivers in mainline _without_ a stable API.

It is a key motivator to ensure drivers remain available and supported well into the future of Linux.

[+] drewg123|7 years ago|reply
RHEL does provide a stable kernel ABI (kABI) that can be and is used by vendors to ship binary drivers. See https://elrepo.org/tiki/FAQ

When I worked for a NIC hardware vendor, we would ship our driver in 4 forms:

1) source tarball

2) upstream kernel

3) RHEL/Centos kABI compliant source and binary rpms

4) Debian pkg using dkms

The upstream kernel driver wasn't good enough for a variety of reasons. For example, on Ubuntu LTS and RHEL, the in-tree driver was often based on a kernel that was several years old and which lacked support for recent hardware or features.

[+] Jweb_Guru|7 years ago|reply
It's not misleading because most Linux kernel drivers run in kernel space; hence compromising them indeed potentially compromises the whole system, which is exactly the article's point. The fact that they're often buggy and poorly supported, unlike the "real" kernel, makes things worse and doesn't invalidate anything.
[+] naasking|7 years ago|reply
> That chart with the growth of the Linux kernel discredits everything.

What claim does it discredit exactly?

[+] AnIdiotOnTheNet|7 years ago|reply
If you insist that all drivers must be a part of the kernel, then it is perfectly fair to count all driver code as part of the kernel code.
[+] digi_owl|7 years ago|reply
While equally obsessed with maintaining a stable API towards userspace. Sadly much of userspace is an unstable churn of API changes.
[+] gnufx|7 years ago|reply
Contrary to what's written about microkernel speed: Systems I used in the '80s to great effect used at least the moral equivalent of a microkernel ("Nucleus"). They were fast (compared with VAXen etc.) for interactive use and supported real-time processes. (Some visitors thought context switching was a bit slow, assuming "microseconds" meant "milliseconds".) The filesystem was fast enough to dispel the assumption a "database" was always required for speedy experimental data access rather than a file per spectrum.

https://en.wikipedia.org/wiki/OS4000

The performance wasn't just because Nucleus initially was implemented in hardware/firmware; two later software implementations were performant (on faster hardware). Also, as the article is about security: at least the original Nucleus also supported an A1/B3-level secure OS.

[+] mikkergp|7 years ago|reply
Well, the obvious solution is to design our kernels on Kubernetes.
[+] nwmcsween|7 years ago|reply
IMO a microkernel isn't a design worth pursuing, as there will always be overhead. Better options are an exokernel with a simple monolithic 'multiplexing' kernel, or a language that has 100% safety (not really possible).
[+] swiley|7 years ago|reply
In a perfect world microkernel OSes would be perfect, but then it's all pointless anyway.

In real life there are certain parts of the OS that have to work or the whole device stops working. Furthermore, isolating dynamic and less-tested application code from these parts is generally a good idea; that's why monolithic OSes are so popular: they're simply less demanding.

[+] Sporktacular|7 years ago|reply
Some super low-quality commentary here. Straight to the monolithic/microkernel tribalism and talk of parallel dimensions, as long as we ignore the content of the paper, which remains far more empirical and convincing than any rebuttal seen here.
[+] nine_k|7 years ago|reply
I'd hazard to say that every design is "flawed" in some regards: there's no way to achieve all desirable qualities and none of undesirable qualities. For one, some desirable qualities contradict each other.

So "${thing} is flawed" is not precise enough; an interesting statement would be "${thing} is not the best choice for ${conditions}". A monolithic OS is not the best choice for a high-reliability system on unreliable hardware. A microkernel OS that widely uses hardware memory protection is not the best choice for a controller with 4KB of RAM. A unikernel setup is not the best choice for a desktop system where the user is expected to constantly install new software. Etc, etc.

In other words, the ancient concept of "right tool for the job" still applies.

[+] renox|7 years ago|reply
OK, so seL4 is safer than Linux; that's not really news. I have questions about seL4 though: is it able to manage several multicore CPUs efficiently? And does power management work?
[+] matachuan|7 years ago|reply
Of course it's about Gernot Heiser's verified ukernel...
[+] justicezyx|7 years ago|reply
Now is a good time to revive microkernels for prime time, as serverless offers the abstraction to make the transition transparent to users and application developers.

Anyone tried that?

[+] brewski|7 years ago|reply
This is like saying Linux is more secure than Windows because not all of Windows' critical vulnerabilities appear in Linux.
[+] throw7|7 years ago|reply
The Jury Is In: Microkernel OS Design Is Flawed.

See, I can make clickbait-titled papers too.

[+] nkkollaw|7 years ago|reply
Linux is the most popular OS in the world (counting Android, of course), so I guess this means either it doesn't really matter that monolithic design is flawed, or it is not flawed enough.