neoCrimeLabs | 1 month ago

This is a well understood and well documented subject. Do your own research.

Start here to help give you ideas for what to research:

https://linuxsecurity.com/features/what-is-a-container-escap...

quotemstr|1 month ago

This kind of response isn't helpful. He's right to ask about the motivations for the claim that containers in general are "not a sandbox" when the design of containers/namespaces/etc. looks like it should support using these things to make a sandbox. He's right to be confused!

If you look at the interface contract, both containers and VMs ought to be about equally secure! Nobody is an idiot for reading about the two concepts and arriving at this conclusion.

What you should have written is something about your belief that the inter-container, intra-kernel attack surface is larger than the intra-hypervisor, inter-kernel attack surface, and that it's therefore less likely that someone will screw up implementing a hypervisor in a way that opens a security hole. I wouldn't agree with this position, but it would at least be defensible.

Instead, you pulled out the tired old "educate yourself" trope. You compounded the error with the weaselly "are considered" passive-voice construction that lets you present the superior security of VMs as a law of nature instead of your personal opinion.

In general, there's a lot of alpha in questioning supposedly established "facts" presented this way.

ashishb|1 month ago

> This is a well understood and well documented subject. Do your own research.

Anything, including the GNU/Linux kernel, can be broken by such security vulnerabilities.

This is not a weakness in the design of containers. `npm install`, on the other hand, is broken by design (due to post-install scripts).

neoCrimeLabs|1 month ago

> This is not a weakness in the design of containers.

Partially correct.

Many container escapes also happen because the security of the underlying host, container runtime, or container itself was poorly or inconsistently implemented. This creates gaps that allow escapes from the container. There is much more potential for mistakes, which creates a much larger attack surface. This is in addition to kernel vulnerabilities.

While you can implement effective hardening across all the layers, the potential for misconfiguration is still there, so the attack surface remains large.

While a virtual host can be escaped from, the attack surface is much smaller, leaving less room for potential escapes.

This is why containers are considered riskier for a sandbox than a virtual host. Which one you use, and why, really should depend on your use case and threat model.

Sad to say, a disappointing number of people don't put much hardening into their container environments, including production k8s clusters. So it's much easier to say that a virtual host is better for sandboxing than containers, because people are less likely to get it wrong.

coppsilgold|1 month ago

Escaping a properly set up container is a kernel 0day. Due to how large the kernel attack surface is, such 0days are generally believed to exist. Unless you are a high value target, a container sandbox will likely be sufficient for your needs. If cloud service providers discounted this possibility then a 0day could be burned to attack them at scale.

Also, you can use the runsc (gVisor) runtime for Docker. If you are careful not to expose vulnerable protocols to the container, there will be nothing escaping it with that runtime.
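As a rough illustration of the runsc suggestion, here is a minimal sketch that only builds a `docker run` command line and never executes it. It assumes gVisor is already installed and registered with Docker under the runtime name `runsc`. The function name `hardened_docker_cmd`, the image name, the resource limits, and the numeric user ID are all hypothetical choices for the example, not anything stated in the thread.

```python
import shlex


def hardened_docker_cmd(image, command, runtime="runsc"):
    """Build (but do not run) a `docker run` argv with common hardening flags."""
    args = [
        "docker", "run", "--rm",
        f"--runtime={runtime}",              # assumes runsc is registered with Docker
        "--network=none",                    # no network unless the workload needs it
        "--cap-drop=ALL",                    # drop every Linux capability
        "--security-opt=no-new-privileges",  # block privilege gain via setuid binaries
        "--read-only",                       # read-only root filesystem
        "--pids-limit=128",                  # illustrative limit
        "--memory=256m",                     # illustrative limit
        "--user=65534:65534",                # illustrative unprivileged UID:GID
        image,
    ]
    return args + list(command)


if __name__ == "__main__":
    # Hypothetical image and command, printed as a shell-quoted string.
    cmd = hardened_docker_cmd("python:3.12-slim", ["python", "-c", "print('hi')"])
    print(shlex.join(cmd))
```

Building the argv as a list keeps quoting out of the picture until the command is printed or handed to a process launcher. Whether these particular flags are enough depends on the threat model.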

neoCrimeLabs|1 month ago

You start with the assumption of a "properly set up container". Also, I believe you are oversimplifying the attack surface.

A container escape can be caused by combinations of breakdowns in several layers:

- Kernel implementation - aka, a bug. It's rare, but it happens

- Kernel compile time options selected - This has become more rare, but it can happen

- Host OS misconfiguration - Can be a contributing factor to enabling escapes

- Container runtime vulnerability - A vulnerability in the runtime itself

- Container runtime misconfiguration - Was the runtime configured properly?

- Individual container runtime misconfiguration - Was the individual container configured to run securely?

- Individual container build - What's in the container that can be leveraged to attack the host?

- Running container attack surface - What's the running container's attack surface?
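The per-container configuration layer in the list above is the easiest one to check mechanically. The following is a minimal sketch, assuming the input is one container's entry from `docker inspect` output parsed into a dict. The function name `audit_container` and the choice of checks are hypothetical, and the checks cover only a few well-known risky settings, not the whole attack surface.

```python
def audit_container(inspect):
    """Return a list of findings for one `docker inspect`-style dict."""
    host = inspect.get("HostConfig") or {}
    config = inspect.get("Config") or {}
    findings = []

    if host.get("Privileged"):
        findings.append("privileged mode enabled")
    if host.get("CapAdd"):
        findings.append("extra capabilities: " + ", ".join(host["CapAdd"]))
    if host.get("PidMode") == "host":
        findings.append("shares host PID namespace")
    # Binds entries look like "host_path:container_path[:options]".
    for bind in host.get("Binds") or []:
        if bind.split(":")[0] == "/var/run/docker.sock":
            findings.append("Docker socket mounted into container")
    # An empty user field is treated here as root, which is an assumption.
    user = (config.get("User") or "").split(":")[0]
    if user in ("", "0", "root"):
        findings.append("runs as root")
    if not host.get("ReadonlyRootfs"):
        findings.append("root filesystem is writable")
    return findings


if __name__ == "__main__":
    example = {
        "HostConfig": {"Privileged": True, "Binds": None, "ReadonlyRootfs": False},
        "Config": {"User": ""},
    }
    for finding in audit_container(example):
        print(finding)
```

A check like this says nothing about the kernel, the host OS, or runtime vulnerabilities, which is the point of listing the layers separately.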

The last two are included for completeness, but in the case of the original article, running untrusted Python code makes them irrelevant.

My point is that you must consider the system as a whole to assess its overall attack surface and risk of compromise. There is a lot more that can go wrong to enable a container escape than you implied.

There are some people who are knowledgeable enough to ensure their containers are hardened at every level of the attack surface. Even then, how many are diligent enough to apply that attention to detail every time? How many automate their configurations?

Most default configurations are not hardened, as a compromise for usability. Most people who build containers do not consider hardening every possible attack surface. Many don't even know the basics. Most companies don't do a good job hardening their shared container environments - often as a compromise to be "faster".

So yeah, a properly set up container is hard to escape.

Not all containers are set up properly - I'd argue most are not.

eyberg|1 month ago

> Escaping a properly set up container is a kernel 0day.

No, it is not. In fact, many of the container escapes we see are caused by bugs in the container runtimes themselves, which can differ quite a bit between implementations. CVE-2025-31133 was published 2? months ago and had nothing at all to do with the kernel - just like many other container escapes.

theamk|1 month ago

Note that the linked article lists three vulnerabilities as examples: CVE-2016-5195 (Dirty COW), CVE-2019-5736 (host runc binary overwrite) and CVE-2022-0185 (heap overflow in the kernel's filesystem context code).

Out of those, only the first one is actually exploitable in common setups.

CVE-2019-5736 requires either an attacker-controlled image or "docker exec". This is not likely to be the case in the "untrusted Python" use case, nor in many Docker setups.

CVE-2022-0185 is blocked by the seccomp filter in default installs, so as long as you don't give your containers the --privileged flag, you are OK. (And if you do give this flag, the escape is trivial without any vulnerabilities.)
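The condition described above (default seccomp filtering still in place) can be checked before a container is launched. Here is a minimal sketch, assuming the launch is expressed as a `docker run` argument list. The function name `disables_default_seccomp` is hypothetical, and the scan is deliberately naive: it does not distinguish Docker options from arguments that follow the image name.

```python
def disables_default_seccomp(argv):
    """Return True if a `docker run` argv turns off Docker's default seccomp filter."""
    for i, arg in enumerate(argv):
        # Privileged containers run without the default seccomp profile.
        if arg in ("--privileged", "--privileged=true"):
            return True
        # Accept both "--security-opt=VALUE" and "--security-opt VALUE".
        if arg.startswith("--security-opt="):
            value = arg.split("=", 1)[1]
        elif arg == "--security-opt" and i + 1 < len(argv):
            value = argv[i + 1]
        else:
            continue
        if value in ("seccomp=unconfined", "seccomp:unconfined"):
            return True
    return False


if __name__ == "__main__":
    print(disables_default_seccomp(["docker", "run", "--privileged", "img"]))
    print(disables_default_seccomp(["docker", "run", "--cap-drop=ALL", "img"]))
```

A custom seccomp profile passed by path is not flagged by this sketch, so a permissive custom profile would slip through.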

ranger_danger|1 month ago

The burden of proof lies with the person making empirically unfalsifiable claims.