top | item 38852652

A memory-safe Linux kernel would be a fairly incredible thing. If you could snap your fingers and have it, the wins would be huge.

Consider that right now a docker container can't be relied upon to contain arbitrary malware, exactly because the Linux kernel has so many security issues and they're exposed to containers. The reason why a VM like Firecracker is so much safer is that it removes the kernel as the primary security boundary.

Imagine if containers were actually VM-level safe? The performance and operational simplicity of a container with the security of a VM.

I'm not saying this is practical. At this point the C version of Linux is here to stay for quite a while and I think, if anything, Fuchsia is the most likely successor (and is unlikely to give us the memory safety that a Rust kernel would). But damn, if Linux had been built with safety in mind security would be a lot simpler. Being able to trust the kernel would be so nice.

edit: OK OK. Yeesh. I meant this to be a hypothetical, I got annoyed at so many of the replies, and this has spiraled. I'm signing off.

I apologize if I was rude! Not a fun start to the morning.

opportune|2 years ago

Memory safety isn’t why containers are considered insufficient as a security boundary. It’s exposing essentially the entire Linux feature surface, and the ability to easily interact with the host/other containers, that makes them unsafe by themselves. What you’re saying about VMs vs containers makes no sense to me. VMs are used to sandbox containers. You still need to sandbox containers if your kernel is written in Rust.

Even just considering Linux security itself: there are so, so many ways OS security can break besides a slight (you’re going to have to use unsafe a whole lot) increase in memory safety

jvanderbot|2 years ago

The culture around memory-safe languages is a positive improvement for the programmer zeitgeist. Man, though, the overreach all the way to "always safe forever" needs to be checked.

snvzz|2 years ago

If you find this amazing, perhaps you should take a look at seL4, which has formal proofs of correctness, going all the way down to the generated assembly code still satisfying the requirements.

It also has a much better overall architecture, the best currently available: A third generation microkernel multiserver system.

It provides a protected (with proof of isolation) RTOS with hard realtime, proof of worst case timing as well as mixed criticality support. No other system can currently make such claims.

plagiarist|2 years ago

I wish L4 had taken off for general purpose computing. That and Plan9 are things I'd really like to try out but I don't have space to fit operating systems in amongst the other projects. They both strike me as having the Unix nature, either "everything is messages in userspace processes" or "everything is a file."

packetlost|2 years ago

Ok, but can I run a desktop on it? Not knocking seL4, it's damn amazing, but it's not exactly a Linux killer.

px43|2 years ago

Eh, seL4 has a suite of tools that turn their pile of C and ASM into an obscure intermediate language that has some formally verifiable properties. IMO this is just shifting the compiler problem somewhere else, into a dark corner where no one is looking.

I highly doubt that it will ever have a practical use beyond teaching kids in the classroom that formal verification is fun, and maybe nerd-sniping some defense weirdos to win some obscene DOD contracts.

Some day I would love to read a report where some criminal got somewhere they shouldn't, and the fact that they landed on an seL4 system stopped them in their tracks. If something like that exists, let me know, but until then I'm putting my chips on technologies that are well known to be battle tested in the field. Maestro seems a lot more promising in that regard.

t8sr|2 years ago

Container vulnerabilities are rarely related to memory bugs. Most vulnerabilities in container deployments are due to logical bugs, misconfiguration, etc. C-level memory stuff is absolutely NOT the reason why virtualization is safer, and not something Rust would greatly improve. On the opposite end of the spectrum, you have hardware vulnerabilities that Rust also wouldn't help you with.

Rust is a good language and I like using it, but there's a lot of magical thinking around the word "safe". Rust's definition of what "safe" means is fairly narrow, and while the things it fixes are big wins, the majority of CVEs I've seen in my career are not things that Rust would have prevented.

insanitybit|2 years ago

> Container vulnerabilities are rarely related to memory bugs.

The easiest way to escape a container is through exploitation of the Linux kernel via a memory safety issue.

> C-level memory stuff is absolutely NOT the reason why virtualization is safer

Yes it is. The point of a VM is that you can remove the kernel as a trust boundary because the kernel is not capable of enforcing that boundary because of memory safety issues.

> but there's a lot of magical thinking around the word "safe"

There's no magical thinking on my part. I'm quite familiar with exploitation of the Linux kernel, container security, and VM security.

> the majority of CVEs I've seen in my career are not things that Rust would have prevented.

I don't know what your point is here. Do you spend a lot of time in your career thinking about hardening your containers against kernel CVEs?

peoplefromibiza|2 years ago

> if Linux had been built with safety in mind security would be a lot simpler

I'm replying simply because you're getting defensive with your edits, but you're missing a few important points, IMO.

First of all, the comment I quoted falls straight into the category of if only we knew back then what we know now.

What does it even mean "built with safety in mind" for a project like Linux?

No one could predict that Linux (which was born as a kernel) would run on billions of devices that people keep in their pockets and constantly use for everything, from booking a table at the restaurant to checking the weather, from chatting with other people to accessing their bank accounts. And that said banks would use it too.

Literally no one.

Computers were barely connected back then, internet wasn't even a thing outside of research centers and universities.

So, what kind of safety should he have planned for?

And to safeguard what from what and who from who?

Secondly, Linux was born as a collaborative effort to write something already old: a monolithic Unix like kernel, nothing fancy, nothing new, nothing experimental, just plain old established stuff for Linus to learn how that kernel thing worked.

The most important thing about it was to be a collaborative effort so he used a language that he and many others already knew.

Had Linus used something more suited to stronger safety guarantees, such as Ada (someone else already mentioned it), Linux wouldn't be the huge success it is now and we would not be having this conversation.

Lastly, the strongest Linux safety guarantee is IMO the GPL license, that conveniently all these Rust rewrites are turning into more permissive licenses. Which steers away from what Linux was, and still largely is, a community effort based on the work of thousands of volunteers.

bigstrat2003|2 years ago

> Lastly, the strongest Linux safety guarantee is IMO the GPL license, that conveniently all these Rust rewrites are turning into more permissive licenses. Which steers away from what Linux was, and still largely is, a community effort based on the work of thousands of volunteers.

There is nothing about permissive licenses which prevents the project from being such a community effort. In fact, most of the Rust ecosystem is a community effort just like you describe, while most projects have permissive licenses. There's no issue here.

K0nserv|2 years ago

I largely agree, but this seems quite unfair to Linux.

> But damn, if Linux had been built with safety in mind security would be a lot simpler. Being able to trust the kernel would be so nice.

For its time, it was built with safety in mind. We can't hold it to a standard that wasn't prevalent until ~20 years later.

ladyanita22|2 years ago

*30 years...

Yes, we're that old.

insanitybit|2 years ago

I don't think it's that unfair, but I don't want to get into a whole thing about it, people get really upset about criticisms of the Linux kernel in my experience and I'm not looking to start my morning off with that conversation.

We can agree that C was definitely the language to be doing these things in and I don't blame Linus for choosing it.

My point wasn't to shit on Linux for its decisions, it was to think about a hypothetical world where safety was built in from the start.

phh|2 years ago

> Imagine if containers were actually VM-level safe? The performance and operational simplicity of a container with the security of a VM.

As far as I know, the order of magnitude of container security flaws from memory safety is the same as security flaws coming from namespace logic issues, and you'll have to top that with hardware issues. I'm sorry, but Rust or not, there will never be a world where you can 100% trust running malware.

> Fuchsia [...] is unlikely to give us the memory safety that a Rust kernel would

Well, being a microkernel makes it easier to migrate bit by bit, and not care about ABI.

insanitybit|2 years ago

> the order of magnitude of container security flaws from memory safety is the same as security flaws coming from namespace logic issues,

Memory safety issues are very common in the kernel, namespace logic issues are not.

GuB-42|2 years ago

More like memory safer. A kernel necessarily has a lot of unsafe parts. See: https://github.com/search?q=repo%3Allenotre%2Fmaestro+unsafe...

Rust is not a magic bullet; it just reduces the attack surface by isolating the unsafe parts. Another way to reduce the attack surface would be to use a microkernel architecture, though it has a cost.

viraptor|2 years ago

You're not really illustrating your point well with the link. If you look through the examples, they're mostly trivial and there's no clear way to eliminate them. Some reads/writes will interact with hardware and the software concepts of memory safety will never reach there because hardware does not operate at that level.

Check a few of the results. They range from single assembler line (interrupts or special registers), array buffer reads from hardware or special areas, and rare sections that have comments about the purpose of using unsafe in that place.

Those results really aren't "look how much unsafe code there is", but rather "look how few, well isolated sections there are that actually need to be marked unsafe". It's really not "a lot" - 86 cases across memory mapping, allocator, task switching, IO, filesystem and object loader is surprisingly few. (Actually even 86 is overestimated because for example inb is unsafe and blocks using it are unsafe so they're double-counted)
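To make the "few, well isolated sections" idea concrete, here is a minimal sketch of the pattern, not code from Maestro. The `RegisterBlock` type and its methods are hypothetical names, and the "device registers" are backed by an ordinary slice so the sketch can run in user space. The point is only that the two `unsafe` operations sit behind a safe API that validates its input first.

```rust
use std::ptr;

/// Hypothetical stand-in for a block of memory-mapped device registers.
/// A real kernel would get this memory from the platform's memory map;
/// here it is an ordinary slice so the sketch runs in user space.
pub struct RegisterBlock<'a> {
    regs: &'a mut [u32],
}

impl<'a> RegisterBlock<'a> {
    pub fn new(regs: &'a mut [u32]) -> Self {
        RegisterBlock { regs }
    }

    /// Safe read: the index is validated before the single unsafe operation.
    pub fn read(&self, index: usize) -> Option<u32> {
        if index >= self.regs.len() {
            return None;
        }
        // SAFETY: `index` was bounds-checked above and the pointer comes
        // from a live, properly aligned slice.
        Some(unsafe { ptr::read_volatile(self.regs.as_ptr().add(index)) })
    }

    /// Safe write: returns false instead of writing out of bounds.
    pub fn write(&mut self, index: usize, value: u32) -> bool {
        if index >= self.regs.len() {
            return false;
        }
        // SAFETY: same reasoning as in `read`.
        unsafe { ptr::write_volatile(self.regs.as_mut_ptr().add(index), value) };
        true
    }
}
```

Callers only ever see `read` and `write`, so an audit for memory safety can concentrate on the two commented `unsafe` lines rather than on every call site.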

insanitybit|2 years ago

Practically speaking, even with `unsafe`, exploiting Rust programs is extremely difficult. With modern mitigation techniques you need to chain multiple vulnerabilities and primitives together in order to reliably exploit software.

Bug density from `unsafe` is so low in Rust programs that it's just radically more difficult.

My company (not me, Chompie did the work, all credit to her for it) took a known bug, which was super high potential (write arbitrary data to the host's memory), and found it extremely difficult to exploit (we were unable to): https://chompie.rip/Blog+Posts/Attacking+Firecracker+-+AWS'+...

Ultimately there were guard pages where we wanted to write and it would have taken other vulnerabilities to actually get a working POC.

Exploitation of Rust programs is just flat out really, really hard.

yencabulator|2 years ago

Maestro has a lot of very old school hardware drivers: VGA, PS/2, IDE/ATA.

Newer hardware tends to look like just a couple of ringbuffers, and the drivers should need a lot less of these hacks. Here's an NVMe driver in Rust that intends to avoid unsafe fully: https://rust-for-linux.com/nvme-driver

arghwhat|2 years ago

While I agree, do note that a significant portion of a kernel is internal logic that can be made much safer.

badrabbit|2 years ago

> Consider that right now a docker container can't be relied upon to contain arbitrary malware, exactly because the Linux kernel has so many security issues and they're exposed to containers

If you don't run docker as root, it's fairly ok for normal software. Kernel memory safety is not the main issue with container escapes. Even with memory safety, you can have logical bugs that result in privilege escalation scenarios. Is docker itself in Rust?

Memory safety is not a magic bullet. The Linux kernel isn't exactly trivial to exploit either these days, although it is still not as hardened as Windows (if you don't consider stuff like win32k.sys font parsing kernel space, since NT is hybrid after all) in my humble opinion.

> Linux had been built with safety in mind security would be a lot simpler

I think it was, given the resources available in 1993. But if Torvalds had caved in and allowed a mini-kernel or NT-like hybrid design instead of hard-core monolithic Unix, it would have been a game changer. In 1995, Ada was well accepted in the mainstream, it was memory safe and even Rust devs learned a lot from it. It just wasn't fun to use for the devs (on purpose, so devs were forced to do tedious stuff to prevent even non-memory bugs). But since it is developed by volunteers, they used what attracts the most volunteers.

The main benefit of Rust is not its safety but its popularity. Ada has been running on missiles, missile defense, subways, aircraft, etc... for a long time and it even has a formally verified subset (SPARK).

In my opinion, even today Ada is a better fit technically for the kernel than Rust because it is time tested and version stable, and it would open up the possibility of easily formally verifying parts of the kernel.

Given how widely used Linux is, it would require a massive backing fund to pay devs to write something not so fun like Ada though.

insanitybit|2 years ago

> Kernel memory safety is not the main issue with container escapes.

I disagree, I think it is the primary issue. Logical bugs are far less common.

> the Linux kernel isn't exactly trivial to exploit either these days

It's not that hard, though of course exploitation hasn't been trivial since the 90s. We did it at least a few times at my company: https://web.archive.org/web/20221130205026/graplsecurity.com...

Chompie certainly worked hard (and is one of if not the most talented exploit devs I've met), but we're talking about a single exploit developer developing highly reliable exploits in a matter of weeks.

Throw839|2 years ago

If I remember correctly, Ada was much slower compared to C. Stuff like boundary checks on arrays has a cost.
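As a rough illustration of what such a check looks like in practice, here is a sketch in Rust rather than Ada; the function names are made up for the example. Indexed access carries a per-element bounds check unless the optimizer can prove it redundant, while slicing once and iterating performs the check a single time up front. Whether the difference is measurable depends on the workload and the compiler, so treat this as a sketch of the mechanism and not a benchmark.

```rust
/// Sums the first `n` elements with indexed access. Each `data[i]` is
/// bounds-checked at run time unless the optimizer can prove it redundant.
pub fn sum_indexed(data: &[u64], n: usize) -> Option<u64> {
    if n > data.len() {
        return None;
    }
    let mut total = 0u64;
    for i in 0..n {
        total += data[i];
    }
    Some(total)
}

/// Same result, but the range is checked once by `get(..n)` and the
/// iterator then walks the prefix without per-element index checks.
pub fn sum_iter(data: &[u64], n: usize) -> Option<u64> {
    data.get(..n).map(|prefix| prefix.iter().sum::<u64>())
}
```

Both functions return `None` for an out-of-range `n` instead of reading past the end of the buffer, which is the behavior the check is paying for.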

Timber-6539|2 years ago

Containers became popular because it doesn't make much sense to be running full blown virtual machines just to run simple single process services.

You can lock down the allowed kernel syscalls with seccomp and go further by confining the processes with AppArmor. Docker has good enough defaults for these two security approaches.
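For anyone who hasn't seen one, a custom Docker seccomp profile is a JSON document with a default action plus a list of syscall rules. The sketch below builds a deny-by-default profile from an allowlist. The helper name `seccomp_profile` is hypothetical, it does no escaping of the syscall names, and a real workload needs far more syscalls than any short list, so this only shows the shape of the file.

```rust
/// Builds a minimal Docker-style seccomp profile as a JSON string:
/// every syscall fails with an error unless it is in `allowed`.
/// Assumes the names are plain syscall identifiers (no escaping is done).
pub fn seccomp_profile(allowed: &[&str]) -> String {
    let names: Vec<String> = allowed.iter().map(|s| format!("\"{}\"", s)).collect();
    format!(
        r#"{{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {{ "names": [{}], "action": "SCMP_ACT_ALLOW" }}
  ]
}}"#,
        names.join(", ")
    )
}
```

The output can be saved to a file and passed to Docker with `docker run --security-opt seccomp=profile.json ...`. Docker's built-in default profile is much more complete than anything hand-written like this.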

Full fat VMs are not immune to malware infection (the impact still applies to the permitted attack surface). Might not be able to easily escape to host but the risk is still there.

Alifatisk|2 years ago

> Consider that right now a docker container can't be relied upon to contain arbitrary malware, exactly because the Linux kernel has so many security issues and they're exposed to containers.

No, Docker containers were never meant for that. Never use containers with untrusted binaries. There is Vagrant and others for that.

mikepurvis|2 years ago

Isn't gVisor kind of this as well?

"gVisor is an application kernel for containers. It limits the host kernel surface accessible to the application while still giving the application access to all the features it expects. Unlike most kernels, gVisor does not assume or require a fixed set of physical resources; instead, it leverages existing host kernel functionality and runs as a normal process. In other words, gVisor implements Linux by way of Linux."

https://github.com/google/gvisor

cmrdporcupine|2 years ago

I like Rust and work in it fulltime, and like its memory-safety aspects but I think it's a bit of a stretch to be able to claim memory safety guarantees of any kind when we're talking about low-level code like a kernel.

Because in reality, the kernel will have to do all sorts of "unsafe" things even just to provide for basic memory management services for itself and applications, or for interacting with hardware.

You can confine these bits to verified and well-tested parts of the code, but they're still there. And because we're human beings, they will inevitably have bugs that get exploited.

TL;DR: being written in Rust is an improvement but no guarantee of a lack of memory safety issues. It's all how you hold the tool.

ho_schi|2 years ago

Yep. And tooling to secure C has improved a lot in recent years. AddressSanitizer is a big improvement. I’m looking forward to C++ improving as a language itself, because it has already improved (smart pointers, RAII, a lot of edge cases regarding sequencing) and they seem to be willing to modify the actual language. This opens a path for projects to migrate from C to C++. A language inherits a lot from its introduction (strengths/weaknesses) but also changes a lot.

Every interaction with hardware (disk, USB, TCP/IP, graphics…) needs to execute unsafe code. And we have firmware. Firmware has probably been an underestimated issue for a long time :(

Aside from errors caused by undetected undefined behavior all kinds of errors remain possible. Especially logic errors. Which are probably the biggest surface?

Example:

https://neilmadden.blog/2022/04/19/psychic-signatures-in-jav...

Honestly I struggle to see the point in rewriting C++ code in Java just for the sake of doing it. Improving test coverage for the C++ implementation would probably have been less work and wouldn’t have created the security issue in the first place.

That being said, I want to see an #unsafe and #safe in C++. I want some hard check that the code is executing only defined behavior. Modern compilers can do it for Rust. The same applies to machine-dependent/implementation-defined code, which isn’t undefined but can also be dangerous.

insanitybit|2 years ago

I've responded to the central point of "there will still be 'unsafe'" here: https://news.ycombinator.com/item?id=38853040

jmakov|2 years ago

Hasn't Kata Containers solved this problem: https://github.com/kata-containers/kata-containers ?

insanitybit|2 years ago

Kata is an attempt at solving this problem. There are problems:

1. If using Firecracker then you can't do nested virtualization

2. You still have the "os in an os" problem, which can make it operationally more complex

But Kata is a great project.

giancarlostoro|2 years ago

I didn't know Firecracker existed, that's really awesome. Looks to be in Rust as well. I'll have to look at how this differs from the approach that Docker uses, my understanding is that Docker uses cgroups and some other built-in Linux features.
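On the cgroups side, a process can see which cgroups it belongs to by reading `/proc/self/cgroup`, where each line has the form `hierarchy-ID:controller-list:path`. The sketch below parses one such line. The `CgroupEntry` type and `parse_cgroup_line` function are hypothetical names for illustration, and this says nothing about namespaces, which are the other half of container isolation.

```rust
/// One parsed line of /proc/self/cgroup (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct CgroupEntry {
    pub hierarchy_id: u32,
    pub controllers: Vec<String>,
    pub path: String,
}

/// Parses a line of the form `hierarchy-ID:controller-list:path`.
/// On cgroup v2 the line looks like `0::/some/path`, so the controller
/// list is empty. Returns None if the line does not match that shape.
pub fn parse_cgroup_line(line: &str) -> Option<CgroupEntry> {
    let mut parts = line.splitn(3, ':');
    let hierarchy_id = parts.next()?.parse().ok()?;
    let controllers = parts
        .next()?
        .split(',')
        .filter(|c| !c.is_empty())
        .map(String::from)
        .collect();
    let path = parts.next()?.to_string();
    Some(CgroupEntry { hierarchy_id, controllers, path })
}
```

Feeding it the lines of `std::fs::read_to_string("/proc/self/cgroup")` inside and outside a container is a quick way to see that a container is an ordinary process placed in a different cgroup, not a separate machine.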

kossTKR|2 years ago

Have there ever been any examples of malware/viruses jumping around through levels like this?

I'm honestly interested to know, because it sounds like a huge deal here, but to my layman's ears it sounds very cool and sci-fi!

maayank|2 years ago

I’m interested in reading more. Where can I find the blog posts?

insanitybit|2 years ago

https://web.archive.org/web/20221130205026/graplsecurity.com...

The company no longer exists so you can find at least some of them mirrored here:

https://chompie.rip/Blog+Posts/

The Firecracker, io_uring, and eBPF exploitation posts.

Chompie was my employee and was the one who did the exploitation, though I'd like to think I was at least a helpful rubber duck, and I did also decide on which kernel features we would be exploiting, if I may pat myself on the back ever so gently.

scoot|2 years ago

> a docker container can't be relied upon to contain arbitrary malware

"to not contain"?

Edit to contain (ahem!) the downvotes: I was genuinely confused by the ambiguous use of "contain", but comments below cleared that up.

OscarCunningham|2 years ago

They're using 'contain' to mean 'keep isolated'. If you put some malware in a docker container, you can't rely on docker to keep the rest of your system safe.

quickthrower2|2 years ago

A Docker image can’t be relied on to not contain malware, and a Docker container can’t be relied on to contain malware.
