Rump kernels (http://rumpkernel.org/) are essentially Unikernels for POSIX. I'm currently working on running unmodified application stacks (base firmware/"not-OS" + rump kernel + userland application) on Xen and later, KVM and bare metal.
How do rump kernels differ conceptually from recent academic work on "library OSes", such as Microsoft's Drawbridge, or Linux-based ones such as Bascule or Graphene?
Why target Xen first? I haven't followed virtualization closely, but I thought KVM was going to be 'the future', since it was adopted by the Linux kernel and Red Hat.
I can't believe no one has mentioned ZeroVM[1] yet. The project page is unfortunately non-descriptive, but Wikipedia has some important details[2]:
> The ZRT [ZeroVM RunTime] also replaces C date and time functions such as time() to give programs a fixed and deterministic environment. With fixed inputs, every execution is guaranteed to give the same result. Even non-functional programs become deterministic in this restricted environment. This makes programs easier to debug since their behavior is fixed.
I've had a play with it - there's a version of python that runs on it, and it's surprisingly usable.
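The pinned-environment idea is easy to illustrate in Python. This is just a hypothetical sketch of the concept — ZRT itself intercepts the C functions, not anything like this:

```python
import random
import time


def run_job(now=time.time, rng=None):
    """A 'program' whose output depends on wall-clock time and randomness."""
    rng = rng or random.Random()
    return {"started": int(now()), "token": rng.randint(0, 10**6)}


def run_job_deterministic():
    """The same job with its nondeterministic inputs pinned, the way ZRT
    pins the C date/time functions: a frozen clock and a seeded RNG."""
    frozen_clock = lambda: 1_400_000_000.0  # arbitrary fixed epoch
    return run_job(now=frozen_clock, rng=random.Random(42))


# Two ordinary runs can differ; two pinned runs never do.
assert run_job_deterministic() == run_job_deterministic()
```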
> "It remains virtually impossible to create a Ruby or Python web server virtual machine image that DOESN’T include build tools (gcc), ssh, and multiple latent shell executables."
At work, our tech team has found an interesting way around this for our Python app. We build out the virtualenv in the Docker container, and then run our Ansible-based deployments inside the same container. With that, our virtualenvs are rsync'd to the app servers, so we can avoid installing developer tools there.
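Roughly, as a hedged sketch (the image name, paths, and playbook are made up; the real setup will differ):

```dockerfile
# Build the virtualenv in a throwaway container so build tools never
# reach the app servers. All names and paths here are hypothetical.
FROM debian:stable
RUN apt-get update && \
    apt-get install -y python3-venv build-essential rsync ansible
COPY requirements.txt /build/requirements.txt
RUN python3 -m venv /opt/venvs/myapp && \
    /opt/venvs/myapp/bin/pip install -r /build/requirements.txt
# The deploy then runs inside this same container, roughly:
#   ansible-playbook deploy.yml
# where the playbook rsyncs the built env out, e.g.:
#   rsync -a /opt/venvs/myapp/ app-host:/opt/venvs/myapp/
```

One gotcha: virtualenvs embed absolute paths, so the path inside the container has to match the path on the app servers.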
I'm ditching virtualenvs and going with good old Debian packaging and a private APT repository.
For VMs/containers that already run a single application, except for some weird edge cases, there's really no point in having a virtual environment in a virtual environment.
I've had initial success with a few simpler projects and am now looking into transitioning more complex ones. Not sure whether it'll go without any hassle, but it seems worth trying. At worst, I'll just have wasted my time and will return to virtualenvs.
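For illustration, a minimal sketch of what the packaging can look like with dh-python/pybuild (package and maintainer names are hypothetical; a real project's layout will differ):

```
# debian/control (hypothetical names)
Source: myapp
Section: python
Priority: optional
Maintainer: Jane Doe <jane@example.com>
Build-Depends: debhelper-compat (= 13), dh-python, python3-all, python3-setuptools
Standards-Version: 4.6.2

Package: python3-myapp
Architecture: all
Depends: ${misc:Depends}, ${python3:Depends}
Description: myapp packaged as a deb
 Ships the application and its Python dependencies through APT
 instead of a per-deployment virtualenv.

# debian/rules
#!/usr/bin/make -f
%:
	dh $@ --with python3 --buildsystem=pybuild
```

dpkg-buildpackage then produces the .deb, which goes into the private APT repo.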
Forgive me if I'm totally clueless, but isn't the idea of the unikernel basically a throwback to the earliest, pre-OS days of computing when all programs needed routines to initialize the base hardware resources before they could perform tasks?
The idea of the unikernel, and the libOS in general, where applications are linked with their bare-minimum OS runtime and packaged, is certainly nifty. But it's kind of funny that people are so hyped over what sounds like a more advanced form of what was regularly done on mainframes 60 years ago.
Because over those 60 years, the OS accumulated layers of abstraction that are useless when it's used as a server OS with programming languages that come with batteries included.
If the programming language has a rich ecosystem with a runtime that is already taking care of hardware abstractions and scheduling, why replicate it a few times in lower layers?
How many schedulers or device drivers are needed to serve network requests?
And it's what's being done now on mainframes too. What was lacking in the commodity processor space was hardware support for mature hypervisors that stay out of the data plane (IBM got this right pretty close to the first try). We're almost there, but there's still some work IMO.
The article is correct that there aren't yet best practices about building minimal and secure Docker images, but it seems like switching to unikernels would be much more work. Unikernels also suffer from the lack of VM resizing and minimum VM sizes being too big in many cases.
Agreed. The real problem is with the current tools for building images.
As a proof of concept, several months ago I built a few tiny Docker images using musl libc and no package manager. But I had to deviate from the normal image build process to do so.
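For context, multi-stage builds (added to Docker later) now make this kind of image straightforward; a hedged sketch using Alpine's musl toolchain, with hypothetical file names:

```dockerfile
# Stage 1: compile a static binary against musl (Alpine's libc).
FROM alpine:3.19 AS build
RUN apk add --no-cache gcc musl-dev
COPY hello.c /src/hello.c
RUN gcc -static -Os -o /src/hello /src/hello.c

# Stage 2: the shipped image holds only the binary -- no package
# manager, no shell, no libc package.
FROM scratch
COPY --from=build /src/hello /hello
ENTRYPOINT ["/hello"]
```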
<future>I'd like to see $CLOUD selling micro-VMs with burstable memory in the tens of MBs, billed by actual CPU time used, similar to what AWS Lambda does.</future>
As for lack of VM resizing, that is a hypervisor/Unikernel implementation detail.
> Unikernels also suffer from the lack of VM resizing and minimum VM sizes being too big in many cases.
Could you clarify what you mean here? Is this specifically a unikernel problem, or an ecosystem problem (in terms of actually trying to deploy unikernels in the wild)? Either way, those seem like different issues and should be discussed separately.
I think the biggest problem with Unikernels that I haven't seen addressed is hypervisor inefficiency. Emulating any part of a kernel or multiple kernels will just be slower. You have 20 guests on your hosts? That's 20 probably-overlapping (uni)kernels running.
Sure you could optimize the heck out of the hypervisor, but now you've created a kernel. And your applications run on that kernel.
With containers, you have one kernel that won't have to instantiate 20 drivers for the disk subsystem. It can be smarter because it knows more about the loads. It's what kernels have been built to do since day 0.
My main concern with unikernels is that eventually the hypervisor will need to be a kernel to get any more optimized. I just worry it will become something of a self-defeating concept.
It's certainly more elegant, although I'm skeptical of the purported performance gains as well, simply because so many optimizations have been thrown into traditional kernels.
Since the guest unikernel isn't a full kernel, the hypervisor interface is much more minimal, and the few host features it needs can be delegated to the CPU via VT-X (e.g. page table mapping).
At least, that's the dream. (I've never actually used Jailhouse or tried any of the research projects attempting this.)
This is unavoidable, because the aim for unikernels is to run on the cloud as a platform. They don't get to do or see anything in the hypervisor (or between guests) because the hypervisor is owned and managed by someone else who very much does not want you to see what's going on.
There's so much wrong with this post, I don't know where to begin. The idea that security is based on removing files, and not the holistic auditing and hardening of a system. The idea that you can't remove a compiler from a system image before packaging and deploying it (seriously? you don't know how to remove a file before you run a packager?). The idea that you have to ship an entire image to update a couple files. The idea that the entire design of an operating system (which is designed to make it easier for programs to run and interact without having to be tailor-made) is obsolete. It's like this guy has never held an operations job in his life, yet he's telling people how systems should be managed.
Actually, CoreOS is a platform designed for orchestration and management of Docker instances. It’s not intended to be used as a base image for Docker containers. Specifically, CoreOS is based on Gentoo Linux, but the recommended base Docker image is Debian.
You learned wrong. CoreOS is a derivative of Chromium OS, which uses Portage as its package manager. Simply typing "emerge" into a window does not Gentoo make, which is a bummer, because it shows a lack of research on the part of the author (that really came out in other areas, too).
It's also largely irrelevant, because CoreOS should in practice be read-only once you boot it, and you're not really concerned with the details of how it's put together (of which its use of Portage is one).
Gentoo's Portage is incredibly powerful. Using it to build your own distro is one of its greatest strengths, which is why Google also uses it for building ChromeOS images.
The ideas here are all very interesting, but I don't think we need to even discuss issues with Docker to find the idea of an immutable server interesting.
I also don't find the "problems" with Docker overly problematic.
* The use of many images is probably(?) not an issue? Do people just use "any old base image" without further thought?
* An image of a few hundred megabytes isn't small, but it's not terribly large either.
Lastly, I see people's confusion over what CoreOS is as beside the point. What it is becomes pretty apparent after taking a look at coreos.com.
Overall I really like the idea of an immutable server though!
One thing to be careful about with these models is that you're moving the burden of maintaining libraries into the application code.
So instead of updating packages and what not, you rely on the developer to update the libraries and reship.
Sure, it's not far from today's model if the dev has to ship the whole container, but it also makes things even harder. How do you know whether you have lib X or lib Z when it's sometimes just dropped among a bunch of files? I think it's much worse: it hides the problem and makes it difficult to detect.
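One partial mitigation is having the image enumerate its own libraries at build time, so there's at least something to diff against advisories. A hypothetical sketch for the Python layer, using only the standard library:

```python
import importlib.metadata
import json


def library_inventory():
    """Map every installed Python distribution to its version -- a tiny
    bill of materials that can be baked into the image (e.g. /sbom.json)."""
    return {
        dist.metadata["Name"]: dist.version
        for dist in importlib.metadata.distributions()
        if dist.metadata["Name"]  # skip broken/nameless installs
    }


if __name__ == "__main__":
    print(json.dumps(library_inventory(), indent=2, sort_keys=True))
```

This doesn't cover C libraries dropped among a bunch of files, of course — and that unpackaged remainder is exactly the hard part.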
I suspect kernels will slowly converge toward Plan 9-like functionality instead. It makes more sense: it's faster, more efficient, simpler.
The main barrier so far has been portability, but with more and more apps being written in very portable languages (Python, Go, C#, ...), it's becoming easier.
For Mirage OS, all libs are released as packages in OPAM [0], so it's really straightforward to find out which versions you're using (and manage/update/remove them). In fact, we just did a set of releases recently [1]. I'm not sure how it is for the other systems.
I've always been a proponent of autarchy in these matters, and so am intrigued by the idea of the unikernel. I'm also going to spend some time with OCaml, so Mirage looks like something that might turn out to be really fun.
Please do get involved! If you're learning OCaml, then http://realworldocaml.org is a great resource. When you start trying out Mirage, join the mailing list and let us know how you get on. Finally, to see where we can take this tech, have a look at http://nymote.org
Immutable systems are on the rise, and I'm glad that it is getting more developer mind share. I literally just wrote an article about this a couple days ago ( https://medium.com/@marknadal/rise-of-the-immutable-operatin... ) and since then I've seen like 3 other posts about it on top of HN.
> Heroku is a great example of immutable servers in action: every change to your application requires a ‘git push’ to overwrite the existing version.
Um, no it's not.
Heroku's buildpacks code caches a hell of a lot of stuff on each execution agent. Still more code has to recognise and try to repair various broken states. It's mutability, through and through.
mato | 11 years ago:
I'll be giving a talk on this at http://operatingsystems.io/ in London on November 25th.
chubot | 11 years ago:
IIRC Linode (or some similar company) used to use User Mode Linux, but switched to Xen for performance reasons.
[1] http://www.zerovm.org/
[2] https://en.wikipedia.org/wiki/ZeroVM
incision | 11 years ago:
In the context of Docker it's inconvenient, but entirely straightforward to create images that don't include those elements.
It's a matter of creating an unwieldy chain of build steps to avoid committing intermediate containers.
If/when something like this [1] gets merged things will be greatly simplified.
1: https://github.com/docker/docker/pull/8021
sibsibsib | 11 years ago:
There's also pex [1], for building self-contained Python executables. Seems to work pretty well, although I haven't tried it for anything production related.
[1] http://pex.readthedocs.org/en/latest/api/index.html
http://mwcampbell.us/blog/tiny-docker-musl-images.html
https://github.com/siemens/jailhouse
jessaustin | 11 years ago:
Well, I learned something...
[0] http://opam.ocaml.org
[1] https://github.com/ocaml/opam-repository/pull/3028