item 41070870

Unfashionably secure: why we use isolated VMs

305 points | mh_ | 1 year ago | blog.thinkst.com

243 comments

[+] PedroBatista|1 year ago|reply
As a permanent "out of style" curmudgeon for the last ~15 years, I like that people are discovering that maybe VMs are in fact the best approach for a lot of workloads, and that the LXC cottage industry and Docker industrial complex, which grew up around solving problems they either created themselves or that were solved decades ago, might need to take a hike.

Modern "containers" were invented to make things more reproducible ( check ) and simplify dev and deployments ( NOT check ).

Personally, FreeBSD Jails / Solaris Zones are the thing I like to dream is pretty much as secure as a VM and a perfect fit for a sane dev and ops workflow. I didn't dig too deep into this in practice; maybe I'm afraid to learn the contrary, but I hope not.

Either way Docker is "fine" but WAY overused and overrated IMO.

[+] compsciphd|1 year ago|reply
As the person who created docker (well, before docker - see https://www.usenix.org/legacy/events/atc10/tech/full_papers/... and compare to docker), I argued that it wasn't just good for containers, but could be used to improve VM management as well (i.e. a single VM per running image - see https://www.usenix.org/legacy/events/lisa11/tech/full_papers...)

I then went on to build a system with kubernetes that enabled one to run "kubernetes pods" in independent VMs - https://github.com/apporbit/infranetes (as well as create hybrid "legacy" VM / "modern" container deployments, all managed via kubernetes.)

- as a total aside (while I toot my own horn on the topic of papers I wrote or contributed to), note that the reviewer of this paper, which originally used the term Pod for a running container - https://www.usenix.org/legacy/events/osdi02/tech/full_papers... - explains where Kubernetes got the term from.

I'd argue that FreeBSD Jails / Solaris Zones (Solaris Zones/ZFS inspired my original work) really aren't any more secure than containers on linux, as they all suffer from the same fundamental problem of the entire kernel being part of one's "TCB", so any security advantage they have is simply due to a lack of bugs, not a better design.

[+] topspin|1 year ago|reply
Isn't this discussion based on a false dichotomy? I, too, use VMs to isolate customers, and I use containers within those VMs, either with or without k8s. These tools solve different problems. Containers solve software management, whereas VMs provide a high degree of isolation.

Container orchestration is where I see the great mistake in all of this. I consider everything running in a k8s cluster to be one "blast domain": containers can be escaped, and faulty containers impact everyone relying on a cluster. Container orchestration is the thing I believe is "overused". It was designed to solve "hyper"-scale problems, and it's being misused in far more modest use cases where VMs should prevail. I also believe the existence of container orchestration and its misapplication has retarded the development of good VM tools: I dream of tools that create, deploy and manage entire VMs with the same ease as Docker, and I suspect these tools have not matured and gained popularity because container orchestration is so easily misapplied.

Strongly disagree about containers and dev/deployment ("NOT check"). I can no longer imagine development without containers: it would be intolerable. Container repos are a godsend for deployment.

[+] everforward|1 year ago|reply
> Modern "containers" were invented to make things more reproducible ( check ) and simplify dev and deployments ( NOT check ).

I do strongly believe deployments of containers are easier. If you want something that parallels a raw VM, you can "docker run" the image. Things like k8s can definitely be complicated, but the parallel there is more like running a whole ESXi cluster. Having done both, there's really only a marginal difference in complexity between k8s and an ESXi cluster supporting a similar feature set.

The dev simplification is supposed to be "stop dealing with tickets from people with weird environments", though it admittedly often doesn't apply to internal applications where devs have some control over the environment.

> Personally FreeBSD Jails / Solaris Zones are the thing I like to dream are pretty much as secure as a VM and a perfect fit for a sane dev and ops workflow

I would be interested to hear how you use them. From my perspective, raw jails/zones are missing features and implementing those features on top of them ends up basically back at Docker (probably minus the virtual networking). E.g. jails need some way to get new copies of the code that runs in them, so you can either use Docker or write some custom Ansible/Chef/etc that does basically the same thing.

Maybe I'm wrong, and there is some zen to be found in raw-er tools.
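For what it's worth, the "custom Ansible that does basically the same thing" might look something like this hypothetical sketch (jail name, paths, and release URL are all invented for illustration; the actual workflow would surely differ):

```yaml
# Hypothetical Ansible tasks: deploying a new release into a FreeBSD jail
# ends up reimplementing roughly what "docker pull && docker run" provides.
- name: Fetch the new release tarball
  ansible.builtin.get_url:
    url: "https://example.com/releases/app-{{ app_version }}.tar.gz"
    dest: /tmp/app.tar.gz

- name: Unpack the release into the jail's filesystem
  ansible.builtin.unarchive:
    src: /tmp/app.tar.gz
    dest: /jails/app_jail/usr/local/app
    remote_src: true

- name: Restart the service inside the jail
  ansible.builtin.command: jexec app_jail service app restart
```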

[+] anonfordays|1 year ago|reply
>Personally FreeBSD Jails / Solaris Zones are the thing I like to dream are pretty much as secure as a VM and a perfect fit for a sane dev and ops workflow, I didn't dig too deep into this in practice, maybe I'm afraid to learn the contrary, but I hope not

Having run both at scale, I can confirm and assure you they are not as secure as VMs and did not produce sane devops workflows. Not that Docker is much better, but it is better from the devops workflow perspective, and IMHO that's why Docker "won" and took over the industry.

[+] markandrewj|1 year ago|reply
I wish people would stop going on about BSD jails as if they are the same thing as containers. I would recommend at least trying jails first. Many people using container technologies are well versed in BSD jails, as well as other technologies such as LXD, CRI-O, micro VMs, and traditional virtualization (KVM).

You will encounter rough edges with any technology if you use it long enough. Container technologies require learning new skills, and this is where I personally see people get frustrated. There is also the shift-left mentality of container environments, where you are expected to be responsible for your own environment, which is difficult for some; i.e. users become responsible for more than in a traditional virtualized environment.

People didn't stop using VMs, they just started using containers as well. What you should use depends on the workload. When you have to manage more than a single VM and work on a larger team, the value of containers becomes more apparent, not to mention the need to rapidly patch and update in today's environment. Often VMs don't get patched because applications aren't architected in a way that allows updates without downtime, although it is possible; there is a mentality of 'if it's not broke, don't fix it'. It is true that virtualized hardware can provide separation boundaries as well, but other things like SELinux also enforce these boundaries. Not to mention containers are often running inside VMs anyway.

Using ephemeral VMs is not a new concept. The idea of 'cattle vs. pets', and the cloud itself, was built on KVM (OpenStack/AWS).

[+] lkrubner|1 year ago|reply
I agree. VMs rely on old technologies, and are reliable in that way. By contrast, the move to Docker then necessitated additional technologies, such as Kubernetes, and Kubernetes brought an avalanche of new technologies to help manage Docker/Kubernetes. I am wary of any technology that in theory should make things simpler but in fact draws you down a path that requires you to learn a dozen new technologies.

The Docker/Kubernetes path also drove up costs, especially the cost of the time needed to set up the devops correctly. Anything that takes time costs money. When I was at Averon, the CEO insisted on absolutely perfect reliability and therefore flawless devops, so we hired a great devops guy to help us get set up, but he needed several weeks to set everything up, and his hourly rate was expensive. We could have just pushed some code to a server, and we would have saved $40,000.

When I consult with early-stage startups, and they worry about the cost of devops, I point out that we can start simply, by pushing some code to a server, as if this were still 2001, and proceed slowly and incrementally from there. While Docker/Kubernetes offers infinite scalability, I warn entrepreneurs that their first concern should be keeping things simple and therefore low cost. The next step is to introduce VMs, then use something like Packer to enable the VMs to be used as AMIs, and so allow the devops to develop to the point of using Terraform -- but all of that can wait till the product actually gains some traction.
[+] dboreham|1 year ago|reply
For me it's about the ROAC property (Runs On Any Computer). I prefer working with stuff that I can run. Running software is live software, working software, loved software. Software that only works in weird places is bad, at least for me. Docker is pretty crappy in most respects, but it has the ROAC going for it.

I would love to have a "docker-like thing" (with ROAC) that used VMs, not containers (or some other isolation tech that works). But afaik that thing does not yet exist. Yes, there are several "container tool, but we made it use VMs" projects (Firecracker and its descendants), but they all need weirdo special setup, and won't run on my laptop or on a generic DigitalOcean VM.

[+] vundercind|1 year ago|reply
Docker’s the best cross-distro rolling-release package manager and init system for services—staying strictly out of managing the base system, which is great—that I know of. I don’t know of anything that’s even close, really.

All the other stuff about it is way less important to me than that part.

[+] benreesman|1 year ago|reply
Namespaces and cgroups and LXC and the whole alphabet soup, the “Docker Industrial Complex” to borrow your inspired term, this stuff can make sense if you rack your own gear: you want one level of indirection.

As I’ve said many times, putting a container on a serverless on a Xen hypervisor so you can virtualize while you virtualize? I get why The Cloud wants this, but I haven’t the foggiest idea why people sit still for it.

As a public service announcement? If you’re paying three levels of markup to have three levels of virtual machine?

You’ve been had.

[+] tptacek|1 year ago|reply
Jails/Zones are not pretty much as secure as a VM. They're materially less secure: they leave cotenant workloads sharing a single kernel (not just the tiny slice of the kernel KVM manages). Most kernel LPEs are probably "Jail" escapes, and it's not feasible to filter them out with system call sandboxing, because LPEs occur in innocuous system calls, too.
[+] dhx|1 year ago|reply
The article doesn't read to me to be an argument about whether sharing a kernel is better or worse (multiple virtual machines each with their own kernel versus multiple containers isolated by a single kernel).

The article instead reads to me as an argument for isolating customers to their own customer-specific systems so there is no web server daemon, database server, file system path or other shared system used by multiple customers.

As an aside to the article, two virtual machines each with their own kernel are generally forced to communicate with each other in more complex ways, through network protocols, which adds complexity and increases the risk of implementation flaws and vulnerabilities. Two processes in different cgroups with a common kernel have simpler communication options available, such as reading the same file directly, UNIX domain sockets, named pipes, etc.

[+] nimish|1 year ago|reply
Clear Containers/Kata Containers/firecracker VMs showed that there isn't really a dichotomy here. Why we aren't all using HW assisted containers is a mystery.
[+] mountainriver|1 year ago|reply
Docker is fantastic and VMs are fantastic.

I honestly can’t imagine running all the services we have without containers. It would be wildly less efficient and harder to develop on.

VMs are wonderful when you need the security

[+] tomjen3|1 year ago|reply
If anything, Docker is underused. You should have a very good reason to make a deploy that is not Docker, or (if you really need the extra security) a VM that runs one thing only (and so is essentially a more resource-hungry Docker container).

If you don’t, then it becomes much harder to answer the question of what exactly is deployed on a given server and what it takes to bring it up again if it goes down hard. If you put everything in Dockerfiles, then the answer is whatever is set in the latest docker-compose file.

[+] ranger207|1 year ago|reply
Docker's good at packaging, and Kubernetes is good at providing a single API for all the infra stuff like scheduling, storage, and networking. I think that if someone sat down and tried to create an idealized VM management solution covering everything between "dev pushes changes" and "user requests website", it'd probably have a single image for each VM to run (like Docker has a single image for each container), and the management of VM hosts, storage, networking, and scheduling of VMs onto hosts would wind up looking a lot like k8s. You could certainly do that with VMs, but for various path-dependency reasons people do it with containers instead, and nobody has a well-adopted system for doing the same with VMs.
[+] cryptonector|1 year ago|reply
Jails/Zones are just heavy-duty containers. They're still not VMs. Not that VMs are enough either, given all the side-channels that abound.
[+] TheNewsIsHere|1 year ago|reply
I feel the exact same way.

There are so many use cases that get shoved into the latest, shiniest box just because it’s new and shiny.

A colleague of mine once suggested running a CMS we manage for customers on a serverless stack because “it would be so cheap”. When you face unexpected traffic bursts or a DDoS, it becomes very expensive, very fast. Customers don’t really want to be billed per execution during a traffic storm.

It would also have been far outside the normal environment that CMS expects, and wouldn’t have been supported by any of our commercial, vendored dependencies.

Our stack is so much less complicated without running everything in Docker, and perhaps ironically, about half of our stack runs in Kubernetes. The other half is “just software on VMs” we manage through typical tools like SSH and Ansible.

[+] ganoushoreilly|1 year ago|reply
Docker is great, way overused 100%. I believe a lot of it started as "cost savings" on resource usage. Then it became the trendy thing for "scalability".

When home enthusiasts build multi container stacks for their project website, it gets a bit much.

[+] icelancer|1 year ago|reply
Same. We're still managing ESXi here at my company. Docker/K8s/etc are nowhere close to prod and probably never will be. Been very pleased with that decision.

I will say that Docker images get one HUGE use case at our company - CUDA images with consistent environments. CUDA/pytorch/tensorflow hell is something I couldn't imagine dealing with when I was in college studying CS a few decades ago.

[+] m463|1 year ago|reply
I've always hated the docker model of the image namespace. It's like those cloud-based routers you can buy.

Docker actively prevents you from having a private repo. They don't want you to point away from their cloud.

Red Hat understood this, and podman allows you to have private container infrastructure, disconnected from Docker Hub.

For my personal stuff, I would like to use "FROM scratch" and build my personal containers in my own ecosystem.

[+] diego_sandoval|1 year ago|reply
> and Docker industrial complex that developed around solving problems created by themselves or solved decades ago.

From my perspective, it's the complete opposite: Docker is a workaround for problems created decades ago (e.g. dynamic linking), that could have been solved in a better manner, but were not.

[+] ktosobcy|1 year ago|reply
> Modern "containers" were invented to make things more reproducible ( check ) and simplify dev and deployments ( NOT check ).

Why?

I have my RPi4 and absolutely love docker(-compose) - deploying stuff/services on it is a breeze compared to the previous clusterf*k of relying on the system repository for apps (or debugging when something doesn't work)... with docker compose I have nicely separated services, each with a dedicated database at the required version (yes, I ran into an issue where one service required a newer and another an older version of the database, meh)

As for development - I do development natively but again - docker makes it easier to test various scenarios...
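The per-service pinned-database setup described above might look something like this hypothetical docker-compose.yml (service names and version numbers are invented for illustration):

```yaml
# Each service gets its own database, pinned at the version it needs,
# so one app's requirement for a newer Postgres can't break another's.
services:
  wiki:
    image: some-wiki:latest
    depends_on: [wiki-db]
  wiki-db:
    image: postgres:16        # this service needs the newer Postgres
    volumes: [wiki-data:/var/lib/postgresql/data]
  tracker:
    image: some-tracker:latest
    depends_on: [tracker-db]
  tracker-db:
    image: postgres:12        # this one still requires the older version
    volumes: [tracker-data:/var/lib/postgresql/data]
volumes:
  wiki-data:
  tracker-data:
```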

[+] turtlebits|1 year ago|reply
Honestly, it really doesn't matter whether it's VMs or Docker. The docker/container DX is so much better than VMware/QEMU/etc. Make it as easy to run workloads in VMs/Firecracker/etc and you'll see people migrate.
[+] gryfft|1 year ago|reply
I've been meaning to do a bhyve deep dive for years, my gut feelings being much the same as yours. Would appreciate any recommended reading.
[+] tiffanyh|1 year ago|reply
Are Jails/Zones/Docker even security solutions?

I always used them as process isolation & dependency bundling.

[+] analognoise|1 year ago|reply
What do you think of Nix/NixOS?
[+] cryptonector|1 year ago|reply
I mean, yeah, but things like rowhammer and Spectre/Meltdown, and many other side-channels are a big deal. VMs are not really enough to prevent abuse of the full panoply of side-channels known and unknown.
[+] ploxiln|1 year ago|reply
> we operate in networks where outbound MQTT and HTTPS is simply not allowed (which is why we rely on encrypted DNS traffic for device-to-Console communication)

HTTPS is not allowed (locked down for security!), so communication is smuggled over DNS? uhh ... I suspect that a lot of what the customer "security" departments do, doesn't really make sense ...

[+] tptacek|1 year ago|reply
The cool kids have been combining containers and hardware virtualization for something like 10 years now (back to QEMU-Lite and kvmtool). Don't use containers if the abstraction gets in your way, of course, but if they work for you --- as a mechanism for packaging and shipping software and coordinating deployments --- there's no reason you need to roll all the way back to individually managed EC2 instances.

A short survey on this stuff:

https://fly.io/blog/sandboxing-and-workload-isolation/

[+] bobbob1921|1 year ago|reply
My big struggle with docker/containers vs VMs is the storage layer (on containers). I'm sure it's mostly a lack of experience/knowledge on my end, but I never have a doubt or concern that my storage is persistent and clearly defined when using a VM-based workload. I can't say the same for my docker/container-based workloads; I'm always a tad concerned about the persistence of storage (or the resource management with regard to storage). This becomes even more true as you deal with networked storage on both platforms.
[+] stacktrust|1 year ago|reply
A modern virtualization architecture can be found in the OSS pKVM L0 nested hypervisor for Android Virtualization Framework, which has some architectural overlap with HP/Bromium AX L0 + [Hyper-V | KVM | Xen] L1 + uXen L2 micro-VMs with copy-on-write memory.

A Bromium demo circa 2014 was a web browser where every tab was an isolated VM, and every HTTP request was an isolated VM. Hundreds of VMs could be launched in a couple of hundred milliseconds. Firecracker has some overlap.

> Lastly, this approach is almost certainly more expensive. Our instances sit idle for the most part and we pay EC2 a pretty penny for the privilege.

With many near-idle server VMs running identical code for each customer, there may be an opportunity to use copy-on-memory-write VMs with fast restore of unique memory state, using the techniques employed in live migration.

Xen/uXen/AX: https://www.platformsecuritysummit.com/2018/speaker/pratt/

pKVM: https://www.youtube.com/watch?v=9npebeVFbFw

[+] mikewarot|1 year ago|reply
It's nice to see the Principle of Least Authority (POLA) in practical use. Some day, we'll have operating systems that respect it as well.

As more people wake up to the realization that we shouldn't trust code, I expect that the number of civilization wide outages will decrease.

Working in the cloud, they're not going to be able to use my other favorite security tool, the data diode, which can positively prevent ingress of control while still allowing egress of reporting data.

[+] fsckboy|1 year ago|reply
just as a meta idea, i'm mystified that systems folks find it impossible to create protected mode operating systems that are protected, and then we all engage in wasteful kluges like VMs.

i'm not anti-VM, they're great technology, i just don't think it should be the only way to get protection. VMs are incredibly inefficient... what's that you say, they're not? ok, then why aren't they integrated into protected mode OSes so that they will actually be protected?

[+] jonathanlydall|1 year ago|reply
Sure, it’s an option which eliminates the possibility of certain types of errors, but it’s costing you the ability to pool computing resources as efficiently as you could have with a multi-tenant approach.

The author did acknowledge it’s a trade off, but the economics of this trade off may or may not make sense depending on how much you need to charge your customers to remain competitive with competing offerings.

[+] vin10|1 year ago|reply
> If you wouldn't trust running it on your host, you probably shouldn't run it in a container as well.

- From a Docker/Moby Maintainer

[+] ianpurton|1 year ago|reply
I've solved the same problem but used Kubernetes namespaces instead.

Each customer gets their own namespace, each namespace is locked down in terms of networking, and I deploy Postgres in each namespace using the Postgres operator.

I've built an operator for my app, so deploying the app into a namespace is as simple as deploying the manifest.
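A namespace lockdown like the one described might start from a default-deny NetworkPolicy; here's a hedged sketch (the namespace name is invented, and the real setup presumably layers more specific allow rules on top):

```yaml
# Hypothetical default-deny policy for a per-customer namespace.
# With an empty podSelector it applies to every pod in the namespace,
# blocking all ingress and egress not opened by later policies.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: customer-acme
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```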

[+] jefurii|1 year ago|reply
Using VMs as the unit allows them to move to another provider if they need to. They could even move to something like an on-prem Oxide rack if they wanted. [Yes I know, TFA lists this as a "false benefit" i.e. something they think doesn't benefit them.]
[+] smitty1e|1 year ago|reply
> Switching to another provider would be non-trivial, and I don’t see the VM as a real benefit in this regard. The barrier to switching is still incredibly high.

This point is made in the context of VM bits, but that switching cost could (in theory, haven't done it myself) be mitigated using, e.g. Terraform.

The brace-for-shock barrier at the enterprise level is going to be exfiltrating all of that valuable data. Bezos is running a Hotel California for that data: "You can checkout any time you like, but you can never leave" (easily).

[+] SunlitCat|1 year ago|reply
VMs are awesome for what they can offer. Docker (and the like) is kind of a lean VM for a specific tool scenario.

What I would like to see is more app virtualization software which isolates the app from the underlying OS enough to provide a safe enough cage for it.

I know there are some commercial offerings out there (and a free one), but maybe someone who has opinions about them, or knows some additional ones, can chime in?

[+] peddling-brink|1 year ago|reply
That’s what containers attempt to do. But it’s not perfect. Adding a layer like gvisor helps, but again the app is still interacting with the host kernel so kernel exploits are still possible. What additional sandboxing are you thinking of?
[+] er4hn|1 year ago|reply
One thing I wasn't able to grok from the article is orchestration of VMs. Are they using AWS to manage the VM lifecycles, restart them, etc?

Last time I looked into this for on-prem, the solutions seemed very enterprise, pay-the-big-bux focused. Not a lot in the OSS space. What do people use for on-prem VM orchestration that is OSS?

[+] JohnCClarke|1 year ago|reply
Question: Could you get the customer isolation by running all console access through customer-specific lambdas which simply add a unique (and secret) header to all requests? Then you could run a single database with sets of tables keyed by that secret header value.

Would give you very nearly as good isolation at much lower cost.

[+] osigurdson|1 year ago|reply
When thinking about multi-tenancy, remember that your bank doesn't have a special VM or container, just for you.
[+] sim7c00|1 year ago|reply
i wish nanoVMs were better. it's a cool concept, leveraging the actual VM extensions for security, but all the ones i've seen hardly get into user mode, and don't have stack protectors or other trivial security features (smap/smep) enabled, making them super insecure anyway.

maybe someday that market will boom a bit more, so we can run hypervisors with VMs in there that host single-application kind of things. like a BSD kernel that runs postgres as its init process or something. (i know that's oversimplified, probably :P)

there's a lot of room in the VM space for improvement, but pretty much all of it is impossible if you need to load an entire multi-purpose, multi-user OS into the VM.....

[+] Melatonic|1 year ago|reply
Eventually we'll get a great system for managing some form of micro VM that lots of people use and that has years of documentation and troubleshooting behind it.

Until then the debate between VM and Containerisation will continue

[+] solatic|1 year ago|reply
There's nothing in Kubernetes and containers that prevents you from running single-tenant architectures (one tenant per namespace), or from colocating all single-tenant services on the same VM, and preventing multiple customers from sharing the same VM (pod affinity and anti-affinity).

I'm not sure why the author doesn't understand that he could have his cake and eat it too.
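A hedged sketch of the anti-affinity part (label key and values are invented for illustration; a real deployment would also need the matching labels on every tenant's pods):

```yaml
# Hypothetical pod spec fragment: require that this pod not be scheduled
# onto a node already running pods labeled with a different customer,
# so no two customers end up sharing the same VM.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: customer
              operator: NotIn
              values: ["acme"]   # this pod belongs to customer "acme"
        topologyKey: kubernetes.io/hostname
```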