A few years ago I invested in a small startup called `hyper.sh`. It open-sourced a container runtime called `runV`, which provided exactly this: the security of virtual machines plus the convenience of containers.
The project later merged with Intel Clear Containers to become what's now called Kata Containers (https://katacontainers.io/), which is now widely used by several Internet giants like Alibaba and Baidu.
The startup was acquired by Ant Financial a couple of years ago.
(I recorded a podcast with one of the hyper.sh engineers, if you can follow Mandarin: https://pan.icu/25)
Probably off topic: back in 2014-15, at my first job, when I was working on OpenStack, they used to show up at the summits. They were super smart and very generous with their time when I had questions. Sometime in 2020 I wondered what had happened to them; I'm happy they had a decent exit.
I used runV with drone.io (on top of Media) to run distributed on-demand VM builders for GitHub Enterprise (we were building physical machine images to deploy, so we needed VM isolation).
It actually worked great, and I've struggled to get quite as flexible a CI system at other jobs since then (the big advantage was that it looked like Docker, so with compose you could either spin up a metal-like nested VM or just pull in some DB containers in your build instance).
I was looking at Kata Containers a few days ago. I'm pretty new to trying to use VMs/containers for services; purely hobby level. I couldn't figure out how to use them, but that's not necessarily a knock on them, as I can't get OpenBSD WireGuard to work either.
I worked with their tech, testing it, and I loved the product. It was definitely ahead of its time. Similar in some ways to what Fly is doing these days, without the edge.
There are a few existing projects out there like this (running Docker images as virtual machines, specifically) if folks are interested. Slim [0] is the one I can remember off the top of my head. I think there are a couple more.
Still, neat to have the walkthrough here in this post.
As I understand the landscape here, the big enabling win of microvms is faster boot time; there's a cool qemu-lite slide deck that goes into detail about how they cut down boot time:
We use AWS's Firecracker to turn our customers' Docker containers into Firecracker microvms (Firecracker is Amazon's Rust VMM, the engine for Fargate and Lambda). Anecdotally: in my dev environment, the difference between Firecracker boot times and native Docker container startup is imperceptible; the logging we do swamps the VM boot stuff. It's very fast.
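For anyone who hasn't poked at Firecracker: a microVM is described by little more than a kernel, a root filesystem, and a machine size. Here's a minimal sketch in Python that builds the JSON you could hand to Firecracker via `--config-file`. The file paths, the boot arguments, and the `build_vm_config` helper are placeholders for illustration, not what the platform described above actually does, and the rootfs is assumed to be an ext4 image already produced from the container's filesystem.

```python
import json


def build_vm_config(kernel_path, rootfs_path, vcpus=1, mem_mib=128):
    """Return a dict in the shape of a Firecracker JSON config file.

    kernel_path and rootfs_path are hypothetical placeholders supplied
    by the caller; nothing here checks that the files exist.
    """
    return {
        "boot-source": {
            "kernel_image_path": kernel_path,
            # Assumed boot args: serial console, no PCI probing.
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        },
        "drives": [
            {
                "drive_id": "rootfs",
                "path_on_host": rootfs_path,
                "is_root_device": True,
                "is_read_only": False,
            }
        ],
        "machine-config": {
            "vcpu_count": vcpus,
            "mem_size_mib": mem_mib,
        },
    }


if __name__ == "__main__":
    # Hypothetical file names; substitute your own kernel and rootfs image.
    config = build_vm_config("vmlinux.bin", "rootfs.ext4")
    print(json.dumps(config, indent=2))
```

Note there's no BIOS or bootloader stage anywhere in that description: the VMM is pointed straight at a kernel image and a root drive.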
> Can we somehow combine the advantages of the docker ecosystem with VMs?
Shameless plug: this is exactly what our goal is with https://kwarantine.xyz. We are creating a new hypervisor (from scratch) that can run strongly isolated Docker/LXC containers.
If you're splitting hairs, Firecracker (AWS) is an offshoot of crosvm from Chrome/Google, which actually was a greenfield VMM :) Anyway, memory-safe virtualization for the win.
I had fun exploring Docker->VM conversion a while back [1], though the larger goal in my case was to be able to make the build path to custom GCP VM Images a bit simpler. Exciting to see other cases where folks are finding this sort of flow useful!
I think a lot of folks are going out of their way to misunderstand what happened. Yes, there are other similar projects and containers. No, none come from a long-established COMMUNITY RUN PROJECT. This is something akin to the difference between VirtualBox and OpenBSD’s vmd. One’s a product with a “free” tier, the other is a community project.
Why not run containers in VMs in containers in VMs? :)
Seriously, VMs are hardly as secure as many people want to believe unless you're utilizing enclaves, and even those have vulnerabilities. I think a better approach is seccomp and whatever other filtering makes sense.
A while back I did some looking at FreeBSD jails to try to figure out why they don't have more mindshare (especially when paired with the nigh-superpower-granting ZFS).
I came away baffled that they weren't more widely-promoted, compared with Docker and friends. After thinking about it for a while, all I can figure is they're so straightforward to use and well-documented that there's no room to make one's name, or to make a buck, re-packaging them or wrapping them in complex tools, so there's little money or glory (= personal marketing via open-source project leadership/contributions) in promoting them.
[EDIT] that is: what would be a blog post in LXC/Docker land... doesn't exist, because it's covered perfectly well in the docs. What would be a simple open-source tool... becomes a blog post, because it's short, simple, and clear enough not to merit special software, but just a quick guide to existing tools. What would be a business, becomes a simple open-source tool without enough of a difficulty/convenience "moat" to support a business.
But the attack surface of a Linux kernel is very large, is pretty unpredictable, and can't be coherently masked out with rules (my favorite example is Jann Horn's VM reference count bug, which was a simple concurrency flaw in the core virtual memory system). By comparison, a Linux KVM hypervisor is not just a subset of the kernel by definition, but also a much smaller codebase, a tiny fraction of the whole kernel.
Replacing shared-kernel isolation like seccomp-filtered containers with VMs is, architecturally, simply the replacement of a large trusted computing base with a smaller one. If the overhead is acceptable, it's hard to argue with from a security perspective.
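To make the "rules" half of that comparison concrete: a seccomp allowlist is essentially a list of permitted syscall names. Below is a rough Python sketch that emits one in the JSON shape Docker accepts via `--security-opt seccomp=<file>`. The syscall list and the `build_seccomp_profile` helper are arbitrary illustrations, not a usable profile.

```python
import json

# Illustrative allowlist only; a real workload needs far more syscalls.
ALLOWED_SYSCALLS = ["read", "write", "close", "exit_group", "mmap", "munmap", "brk"]


def build_seccomp_profile(allowed):
    """Build a deny-by-default seccomp profile in Docker's JSON format."""
    return {
        # Any syscall not listed below fails with an error instead of running.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            # De-duplicate and sort so the output is stable.
            {"names": sorted(set(allowed)), "action": "SCMP_ACT_ALLOW"}
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_seccomp_profile(ALLOWED_SYSCALLS), indent=2))
```

Even a list this short has to allow `mmap`, `munmap`, and `brk`, all of which lead into the kernel's memory-management code, so a flaw in that core code isn't something the filter can carve out.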
Security and performance aren't the only driving forces; there are a lot of technical and operational benefits to the abstraction and standard interfaces that you get when running stacks that might otherwise look like someone took an Xzibit meme too far.
Also remember on a modern system, there are often at least 2 additional layers at work abstracting interfaces to the "bare metal" OS already.
riobard|4 years ago
temp_praneshp|4 years ago
XorNot|4 years ago
cptnapalm|4 years ago
polskibus|4 years ago
lifty|4 years ago
eatonphil|4 years ago
[0] https://github.com/ottomatica/slim
hardwaresofton|4 years ago
https://github.com/containers/krunvm
https://github.com/weaveworks/ignite
tptacek|4 years ago
https://www.linux-kvm.org/images/d/d2/03x05B-Chao_Peng-Light...
The big win was slashing away the BIOS stuff.
rwmj|4 years ago
bonzini|4 years ago
ashishbijlani|4 years ago
amscanne|4 years ago
[1] https://cappsule.github.io/ [2] https://en.wikipedia.org/wiki/Bromium#/media/File:Bromium-en...
mikepurvis|4 years ago
stefanha|4 years ago
It's powered by https://github.com/containers/libkrun.
forty|4 years ago
bhawks|4 years ago
cpach|4 years ago
https://github.com/firecracker-microvm/firecracker
jjacobson93|4 years ago
thekevjames|4 years ago
[1] https://thekev.in/blog/2019-08-05-dockerfile-bootable-vm/ind...
dzonga|4 years ago
OldGoodNewBad|4 years ago
gravypod|4 years ago
justincormack|4 years ago
laurencerowe|4 years ago
https://cloud.google.com/container-optimized-os/
jonjonsonjr|4 years ago
encryptluks2|4 years ago
handrous|4 years ago
tptacek|4 years ago
gorkish|4 years ago
riobard|4 years ago
dboreham|4 years ago