How to Escape a Container

[+] remram|2 years ago|reply

1 and 7 need SYS_ADMIN, not available by default in any container runtime.

2 needs the Docker socket, which is explicitly meant for running other workloads

3 needs shared PID namespace in addition to SYS_PTRACE, neither is granted by default by any container runtime

4 needs SYS_MODULE, again no one has that

5 and 6 need DAC_READ_SEARCH, no one grants that, no one uses that

None of those seem like vulnerabilities or things that would be available without the admin taking explicit steps to specifically want to allow escaping. Being root in the container would not be enough to get any of those capabilities.
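For anyone who wants to check this from inside a container, here is a minimal sketch that decodes the `CapEff` mask from `/proc/self/status` and reports which of the capabilities listed above are present. The bit numbers come from `<linux/capability.h>`; the helper names (`risky_caps`, `read_cap_eff`) are made up for illustration and are not part of any tool.

```python
# Capability bit numbers as defined in <linux/capability.h>.
RISKY_CAPS = {
    "CAP_DAC_READ_SEARCH": 2,
    "CAP_SYS_MODULE": 16,
    "CAP_SYS_PTRACE": 19,
    "CAP_SYS_ADMIN": 21,
}


def risky_caps(cap_eff_hex):
    """Return the risky capabilities set in a CapEff hex mask, sorted by name."""
    mask = int(cap_eff_hex, 16)
    return sorted(name for name, bit in RISKY_CAPS.items() if mask & (1 << bit))


def read_cap_eff(status_path="/proc/self/status"):
    """Read the effective capability mask of the current process (Linux only)."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff not found in " + status_path)
```

Calling `risky_caps(read_cap_eff())` inside a container and getting an empty list back suggests none of these four capabilities were added. It says nothing about kernel bugs or mounted sockets.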

[+] clvx|2 years ago|reply

I've seen people who use testcontainers and run their CI workloads in containers abusing the docker.sock mount so they can spin up the tests. The anti-pattern of using docker.sock has always been a threat because, when Docker gained popularity in CI/CD systems, it was the easiest platform-independent way to spin up isolated environments. From what I saw, this was a very common pattern in Jenkins a few years ago.
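A rough way to spot this pattern from inside a CI container is to test whether a Unix socket exists at the usual Docker path. This is only a sketch: `/var/run/docker.sock` is the conventional location, but a job can mount the socket anywhere, and `is_unix_socket` is a hypothetical helper name.

```python
import os
import stat

# Conventional Docker socket path; an assumption, since it can be mounted elsewhere.
DEFAULT_SOCKET = "/var/run/docker.sock"


def is_unix_socket(path=DEFAULT_SOCKET):
    """Return True if path exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing path or no permission to stat it: treat as "not found".
        return False
```

If `is_unix_socket()` returns True, whatever runs in that container can most likely ask the Docker daemon to start new containers on the host, which is exactly the risk described above.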

[+] ipdashc|2 years ago|reply

Exactly! This is a huge pet peeve of mine. I feel like I'm going crazy when I read articles like this one (that is, any of the similar "how to escape containers" blog posts). With the exception of a rare few that actually describe 0-days, the overwhelming majority seem to be exactly like this.

It confuses the heck out of me because red-team-type people will shout from the mountaintops about how insecure containers are and how absolutely trivial it is to break out of them, but when I go look for an example of that trivialness, all I find is stuff like this. Not that these aren't at all useful techniques - there are plenty of containers with --privileged, or a Docker socket mount, etc. But surely this doesn't apply to >90% of containers out there, especially ones that are exposed to the Internet. Your average Redis or Nginx container, or some container running a Python or Node webapp, is not going to have Docker mounted or some weird capability added. Sure, misconfigs happen, us sysadmins get lazy, but this is really common-sense stuff. It feels almost unfair to "blame" it on the container.

Of course, as mentioned, there are 0-days that allow for container breakout, and those are the truly scary stuff. But they seem to be few and far between, and they get publicized semi-widely and patched pretty quickly when they are found.

So to this day I don't really understand all the security folks who act like they (as in any decent attacker, not just a nation-state with 0days up the wazoo) could break out of a container with their eyes closed, while the only material I can find on the open Internet is stuff like this. Am I just looking in the wrong places?

[+] cyrnel|2 years ago|reply

> to specifically want to allow escaping

Not really unfortunately. A novice sysadmin granting someone's request to add some permission called SYS_MODULE may not know that it could be equivalent to full root access. That's why posts like this are important for education.

[+] t8sr|2 years ago|reply

As someone who works in security (both pre-fail and post-fail), 95% of incidents happen because things are not configured correctly. Actual vulnerabilities in Linux are expensive, and any attacker competent enough to have one will not burn it needlessly if they can get in because a SWE with a deadline gave the container SYS_PTRACE.

I think your response betrays a broken line of communication in our industry. People tend to assume most production environments are configured by people who knew what they were doing, were deploying a well-behaved application and had time and management support to do a good job. That's almost never the case, even in well-funded, well-regarded tech companies.

[+] yjftsjthsd-h|2 years ago|reply

I find it relatively comforting that every single one of these requires disabling security features, though it is a good reminder of why certain things are dangerous to give containers and should be avoided.

[+] Retric|2 years ago|reply

It doesn’t mention zero day exploits, but those still occur.

[+] luma|2 years ago|reply

Step 1: already be root.

[+] tpetry|2 years ago|reply

All of the capabilities needed by these exploits are, on Linux, only available to root users, and here they have been granted to a container.

So these "escapes" are basically giving a container full access to the host and using that. None of them are enabled by default.

[+] fulafel|2 years ago|reply

This post seems confused and misleading. It only lists ways to escape containers with various non-default capabilities added, but doesn't address the other, more realistic ways seen in the wild (e.g. uid=0 confusion due to disuse of user namespaces, kernel privilege escalation bugs, etc.).
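To illustrate the uid=0 point: when user namespaces are not used, root inside the container is the same UID 0 as root on the host. A minimal sketch that checks this by parsing the contents of `/proc/self/uid_map` follows. The function name is hypothetical, and the check assumes the container's user namespace was created directly from the host's initial namespace (with nesting, "outside" means the parent namespace, not necessarily the host).

```python
def root_maps_to_host_root(uid_map_text):
    """Return True if UID 0 inside the namespace maps to UID 0 outside it.

    Each uid_map line has three fields:
    <first ID inside> <first ID outside> <range length>.
    """
    for line in uid_map_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(field) for field in fields)
        if inside == 0 and length > 0:
            return outside == 0
    # UID 0 is not mapped at all in this namespace.
    return False
```

On a Linux box this would be called as `root_maps_to_host_root(open("/proc/self/uid_map").read())`. An identity mapping such as `0 0 4294967295` means no UID remapping is in effect.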

[+] t8sr|2 years ago|reply

I don’t think it’s misleading - working in security, I can tell you that containers are misconfigured with too many privileges everywhere you look. SWEs like focusing on cool stuff, like kernel exploits, to the detriment of basic production security. From that point of view, this is an article I think many SWEs could benefit from reading, especially on teams that do their own deployments.

[+] arkadiyt|2 years ago|reply

Not mentioned: use a kernel N-day, of which there are many. Patch your hosts, folks.

[+] insanitybit|2 years ago|reply

For many, many services, requiring an entire separate exploit against the kernel is a huge win. So many services can just be dropped into a container with roughly 0 effort. If you want a higher level of security it's going to take significantly more effort.

[+] Sheeny96|2 years ago|reply

I think there's some version of the Dunning-Kruger effect going on in these comments - assuming that no one would include this number of security flaws unless they did so intentionally. Perhaps it's that this forum tends to attract people more engaged in the CS space who wouldn't do this - but I've seen enough brute forcing in the wild to know that this ABSOLUTELY exists where a "just make it work" mentality is present.

[+] danrl|2 years ago|reply

I genuinely thought this link was about escaping a standardized steel shipping container, something I recently had to seriously consider. In that regard a disappointing click.

I also wrote my own Docker-like containerization code for educational purposes a while ago, so "container" has both these meanings for me. Yet my brain was expecting a physical escape story. Brains are funny!

[+] jhiggins777|2 years ago|reply

Thought the same thing.

I, however, have not written any containerization code.

[+] timost|2 years ago|reply

Using rootless podman limits the blast radius of a container escape.

Also, many of the capabilities described in this article aren't compatible with a rootless user deployment scenario.

[+] rompledorph|2 years ago|reply

What's the state of lightweight VMs? Can they replace containers yet?

Not sharing the kernel with the host os (or other containers) is a huge security boundary.

[+] ptx|2 years ago|reply

They would still be vulnerable to the sort of attack described in the article, though: If the host deliberately hands the guest a socket it can use to execute commands as root on the host, there's nothing that can be done to make it secure.

[+] SpookyChoice|2 years ago|reply

Kata Containers and gVisor come to mind. Kata Containers really does spin up VMs, with various optimizations. gVisor uses a reimplementation of the kernel syscall interface in Go, which is also a pretty interesting idea.

[+] b112|2 years ago|reply

To me, a guy that's been doing this for decades, this is a weird thing to say.

A bit of debootstrap, a few apt-get commands, and copying in config files, and you have a lightweight VM, minimal image.

Something people have been doing for 20 years.

There are also all sorts of tricks, such as having two images, one for the app layer and one for the OS, which makes deploys of app updates faster.

I'm not even sure why people care about image size all that much. You copy it to your local cluster, then deploy from there.

[+] munawwar|2 years ago|reply

AWS Lambda and fly.io use Firecracker VMs internally. So I think they can replace containers to some extent.

[+] denton-scratch|2 years ago|reply

All these escapes seem to involve Docker. It looks to me as if at least some of them are strictly Docker-dependent, but I've never used Docker, so I'm no expert.

[+] unknown|2 years ago|reply

[deleted]

[+] adriangrigore|2 years ago|reply

Why escape?

[+] c0pium|2 years ago|reply

You are being downvoted right now, but this is a great point. If you have execution control in someone’s container, use the container’s existing secrets to achieve your goals. I don’t need to escape your web app’s container to steal all of the contents of the backend database.

[+] heyoni|2 years ago|reply

All those processes living in our containers are going to want the red pill eventually. Haven’t you seen The Matrix?

But seriously though, it’s so you can write exploits or satisfy that curious itch when working with a cloud service.

[+] badrabbit|2 years ago|reply

Useful: https://github.com/stealthcopter/deepce

[+] unknown|2 years ago|reply

[deleted]

[+] Mtinie|2 years ago|reply

I found the content interesting, and it gave me new things to think about.

But honestly, I was hoping before I clicked it that this was going to be about how to escape from the inside of a shipping container.

[+] itishappy|2 years ago|reply

Apparently the answer to escaping from a shipping container is...

You don't, they cannot be opened from inside once locked. Also they're airtight, so bang on the walls and hope help arrives before you suffocate.

That's more nightmare inducing than I was hoping.

[+] phero_cnstrcts|2 years ago|reply

Me too, that skill could come in handy one day.

[+] lurquer|2 years ago|reply

Likewise. Damn. I guess I’ll stay stuck in here for a while longer.

[+] fbdab103|2 years ago|reply

I think the only answer is you fire up the sawzall your captor neglected to remove before they locked you inside. Also hope there are some earplugs, because you are going to be deaf before you can cut an escape hatch.

[+] unknown|2 years ago|reply

[deleted]

[+] AndyMcConachie|2 years ago|reply

I clicked on this thinking I would learn how to escape a shipping container if I ever got trapped in one ;)

[+] shepherdjerred|2 years ago|reply

I had the exact same thought! That would make a very interesting article.

[+] unknown|2 years ago|reply

[deleted]

[+] paxys|2 years ago|reply

Containers are not a security boundary.

Any system that treats them as such is inherently compromised.

[+] harporoeder|2 years ago|reply

All of these escapes rely on some obvious explicit reduction of the isolation guarantees. If you know how to escape a simple docker container invoked with default parameters such as `docker run --rm -it ubuntu /bin/bash` I'm sure many people would be interested.

[+] ajross|2 years ago|reply

Is putting processes in separate memory spaces not a security boundary to you? Isolating separate app responsibilities in separate UIDs? Filesystem permission bits? All those are "weaker" boundaries than containers. Do you really claim that these have ZERO security value?

That's silly. Of course containers are a security boundary. They have advantages and disadvantages. Treat them as tools and not slogans.

[+] carbotaniuman|2 years ago|reply

What is the security boundary then? Everywhere I read treats them as a security boundary for say, untrusted code.

[+] johncolanduoni|2 years ago|reply

This is true, but the exploits listed in this article are poor evidence of this. If it was as simple as not giving Linux containers any capabilities or host sockets they would be a decent security boundary.

I find it super frustrating that we're stuck with kernels with such inherent weaknesses in their security approach that we have to re-implement them in userspace in one way or another (gVisor, Firecracker, etc.) just to get the hardware-provided userspace/kernel boundary to work properly.

[+] brutal_chaos_|2 years ago|reply

A container is a security boundary. However, no security boundary is perfect and more boundaries must be put in place as well, depending on your threat model.

edit: typo

[+] insanitybit|2 years ago|reply

Containers are absolutely a security boundary, and an excellent one at that - at least, Docker containers are.

1. You get file, process, and network namespaces, which are a security boundary

2. You get a seccomp filter, which is a security boundary
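Whether the seccomp filter from point 2 is actually active can be read from the `Seccomp:` field of `/proc/<pid>/status` (0 = disabled, 1 = strict, 2 = filter). A small sketch, with a hypothetical function name, that parses that field from the status text:

```python
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text):
    """Return the seccomp mode named in /proc/<pid>/status text, or None if absent."""
    for line in status_text.splitlines():
        # The trailing colon keeps this from matching the "Seccomp_filters:" line.
        if line.startswith("Seccomp:"):
            return SECCOMP_MODES.get(int(line.split()[1]), "unknown")
    return None
```

Inside a container, `seccomp_mode(open("/proc/self/status").read())` returning `"disabled"` would suggest the container was started without a seccomp profile.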

The "containers are not a security boundary" meme needs to die.

Elsewhere you mention that "containers are not sufficient for untrusted code" but that's a very specific and very niche threat model. Most people don't say "send me a binary and I'll execute it", or have arbitrary RCE + multitenancy concerns.

Containers aren't sufficient for multi-tenant RCE because the RCE is by design so 100% of your security pressure is on the container at that point. In the vast majority of cases you're dealing with servers that don't intend to allow arbitrary code execution, and containers are an extremely easy way to drive up the cost of an attack given that the attacker has already spent a lot of time and money on the RCE.

SELinux is also not sufficient for the "RCE by design" threat model - is SELinux not a security boundary?

Further, containers can limit the impact of remote vulnerabilities like path traversal attacks, since they have file isolation by default.

edit: I see elsewhere that there's a real lack of clarity here.

First off, a security boundary can be meaningfully defined as a limitation on an attacker that does not have a way around it without additional exploitation.

So the main reason why people have said "containers are not a security boundary" is because:

a) Very, very early on, escaping a container was trivial - like you could just ask to leave and you'd be out.

b) There were some blog posts basically saying "containers aren't sufficient for multi-tenancy" where arbitrary users can run arbitrary code on the same host. This is still the case today - but it's also an extremely rare threat model.

Why would containers not be sufficient for (b)? Because the majority of the Linux kernel is still exposed to the attacker within a container - the vast majority of system call interfaces are exposed (but seccomp removes a number of these, which is nice). The Linux kernel is not at all sufficiently hardened against attackers who can make arbitrary system calls; therefore containers are not sufficient against those attackers. If you give the attacker RCE by default (i.e. your service is "Send me a binary and I'll run it") then an attacker can spend all of their time and money just on a local privesc, which isn't crazy difficult.

Since the cost of RCE in an RCE-aaS is 0 the consensus is that containers aren't strong enough for RCE-aaS threat models. In that case use Firecracker or gVisor or a dedicated host.

Otherwise, RCE costs tend to be pretty high and having to develop an additional LPE on top of one is, at minimum, quite a pain for many attackers.

Containers are extremely easy to deploy software into, something like a Firecracker VM is not. Containers are basically just processes, so you can monitor them and manage them trivially. Monitoring and managing VMs with processes inside of them is obviously harder. So I think the 'bang for your buck' with containers is extremely solid.

[+] rjzzleep|2 years ago|reply

They were on Solaris, no?

[+] SoftTalker|2 years ago|reply

> Containers are not a security boundary.

The name is at least misleading if not wrong, then? What do they "contain"?

[+] 10000truths|2 years ago|reply

This kind of reductive dogma is meaningless FUD. A container is as secure as its underlying implementation.

[+] imetatroll|2 years ago|reply

Is gVisor considered sufficient when running 3rd-party code? Are there other measures that should be taken in addition to using gVisor?

[+] 2OEH8eoCRo0|2 years ago|reply

Of course they are. There is debate over how good of a boundary they are but even caution tape is a security boundary.
