top | item 8628582

Is Docker ready for production?

155 points| EtienneK | 11 years ago |t37.net

70 comments

[+] contingencies|11 years ago|reply
This appears to be a good, honest and multi-faceted review against real world requirements. It is a shame so many replies are "if it works for me and I have n instances, you must be wrong" instead of truly substantive. The author is bringing up points of concern, not rubbishing the project. FWIW as frequently shared my own summary of the area exists at http://stani.sh/walter/pfcts/ and a general architectural response to the same sorts of concerns derived at http://stani.sh/walter/pfcts/original/
[+] freshflowers|11 years ago|reply
My general impression of Docker is that for most of us, it adds complexity with very little upside. Only when you have an already complex orchestration does Docker help you reduce that complexity.

I'm expecting that to shift with both Docker and the ecosystem of tools and services around it maturing, to the point where in a few years time Docker may even be advantageous to the most simple setups.

I'm really excited about Docker, but personally I'm not expecting to have any use for it in production for another two years.

[+] robeastham|11 years ago|reply
Check out http://deis.io/ and follow the Heroku buildpack instructions to take some of the headache out of all the DevOps questions that Dockerfiles etc. might pose if you are just a humble web developer. Deis runs on CoreOS, which is an excellent base OS that requires you to use Docker. CoreOS upgrades automatically in the background too.

Deis is a fully open-source PaaS inspired by Heroku. You can dip your toe in the water by spinning up a Deis cluster and then just following the Heroku-inspired/compatible workflow described in the Deis docs. When you are ready to experiment, Deis lets you use Dockerfiles instead of a Heroku-inspired workflow.

Once you have your Deis cluster up you might even want to sidestep it entirely and run Docker containers directly on your CoreOS cluster.

Deis can be used with Vagrant and VirtualBox, and it can also deploy to multiple clouds (EC2, Rackspace, Digital Ocean) and to bare metal (e.g. Kimsufi should be possible: http://bit.ly/1t2PPXB). I've been playing around with it and it looks great so far.

[+] caw|11 years ago|reply
My coworker and I have been fighting with Docker for the last few weeks like the OP, so maybe HN can help us here. Right now we use Chef to provision the hosts and we run services normally. Now we're trying to introduce Docker into our test environment, so we don't need to replicate a multi-node database cluster out of machines. In the process, it seems like we're having to repeat most of our Chef recipes to create the config files, and put them in the appropriate locations to let Docker mount them as a shared volume.

Are we doing it wrong? Should everything be Docker and all of our Chef recipes should be written for Docker? Or is this right and just a natural pain of having both bare metal processes and Dockerized processes? Unlike the article, so far everything is 1 process or application per container, not a full OS.

[+] eigenrick|11 years ago|reply
The thing about Chef is that it tries to simplify the process of configuring and installing many applications that might run on a server.

Docker pretty much removes the need for such complexity. Don't think of it as a provisioning tool like Chef. Don't think of it as a VM. Think of it as an isolated filesystem and process environment that eliminates the complexity of supporting multiple, disparate application stacks.

The story for persistent storage with Docker is less compelling, because now you have shared resources between containers that should be isolated. That said, we run Cassandra clusters in Docker and have no trouble.

Also, if you want to re-use the work you've done with Chef, there are Chef plugins for Docker which make it easy to provision Docker images with Chef (chef-solo, I think).

[+] dedene|11 years ago|reply
Has anyone come up with a way to speed up the "bundle install" step in a Docker build? The smallest change will cause this step to completely rerun, which takes a long time for a Ruby application with lots of gem dependencies.

One approach might be to base the final Docker image on another Docker image, which has a snapshot of all Rubygem dependencies at a certain point. In the depending image, the 'bundle install' will then do an incremental update and the Docker build will go a lot faster.

But I was wondering how other people are solving this?

[+] hipsters_unite|11 years ago|reply
If you add the Gemfile and Gemfile.lock to the image before you add all the application files, you can cache that step IIRC. Then bundle install only reruns when that step changes, so long as it's placed after that step.
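
A minimal sketch of that ordering, with an illustrative base image and paths:

```dockerfile
FROM ruby:2.1

# Copy only the dependency manifests first: this layer, and the slow
# bundle install below it, stay cached until the Gemfile or lock changes.
ADD Gemfile /app/Gemfile
ADD Gemfile.lock /app/Gemfile.lock
WORKDIR /app
RUN bundle install

# Adding the rest of the app only invalidates the layers from here down.
ADD . /app
```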
[+] myhf|11 years ago|reply
I usually structure the Dockerfile to make the installation step cacheable, like so:

    # This only re-runs the slow installation when requirements.txt changes
    ADD requirements.txt /app/requirements.txt
    RUN pip install -r /app/requirements.txt
    
    # This re-runs every time you change any file, but is very fast
    ADD . /app/
[+] vidarh|11 years ago|reply
I do "bundle install --standalone" in a separate Docker image whenever I update dependencies, and don't install/run bundler at all in the deployment image.

A typical large Ruby app ends up dragging in tons of build dependencies that do not need to be in the final image, just to be able to do things like compile extensions etc.

[+] kimi|11 years ago|reply
Put the things that are going to change as the last step; use very few steps (e.g. don't copy files one by one) and let Docker cache the results. We use WhaleWare as a templating tool: https://github.com/l3nz/whaleware and it works great (or well enough).
[+] imdsm|11 years ago|reply
Makes sense to have two docker repositories (images).

In one you install deps, and in the other, which relies on the first, you install your app. If you change your deps, then you rebuild the first.

I'm not a Ruby dev, but this seems like it should be quite simple?

[+] shawabawa3|11 years ago|reply
What I've done is just mount the Rails app in a volume, so it's not part of the Docker image at all.

This means I can run multiple Rails apps using the same image just by changing which volume to mount.

[+] gowan|11 years ago|reply
I have a few Docker containers I use as a testing environment. I ended up mounting gems from the host. They were mounted from a /tmp dir so Docker would not pollute the gems on the host.
[+] vidarh|11 years ago|reply
Personally I think it shows that this is a "first look" at Docker. Much of this works much better than the post indicates.

> The final image is 570MB big. I could not shrink it more unless I remove the whole Python and Perl stack. Since both are necessary for many system dependencies, starting with apt-get, this was not possible. I still need a way I can improve or upgrade my container.

?!? The article starts by pointing out they use immutable servers and blue/green deployment. In that context, you will not improve or upgrade the container: You build a new one. And if you want to cut build dependencies from the final container: Do the build in one container, install the build-artefacts to a volume, and use the contents of that volume to build a container without the build dependencies.

It'd be great to get "built in" support for this, but it's not hard to do.
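
A rough sketch of that flow with plain docker commands (the image names, paths and `make install` step are all illustrative):

```shell
# 1. Run the build in a fat "builder" image that carries compilers,
#    headers, provisioning tools, ssh keys, etc.
docker run --rm -v /tmp/artefacts:/out my-builder-image \
    sh -c 'make && make install DESTDIR=/out'

# 2. Build the slim runtime image from a context containing only the
#    artefacts plus a trivial Dockerfile that ADDs them onto a small base.
docker build -t myapp /tmp/artefacts
```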

> There’s no easy way logging with Docker.

The standard way of logging with Docker is to log to standard out, which gets captured and is accessible via "docker logs". If he'd not dismissed systemd out of hand, he'd also easily have gotten it fed into journald, with the option of having it relayed to a remote or local syslog as per his preferences.
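
For reference, a minimal sketch of that capture (`myapp` is a placeholder image whose main process writes to stdout):

```shell
docker run -d --name web myapp     # whatever PID 1 prints is captured
docker logs web                    # dump the captured output so far
docker logs -f web                 # follow it, tail -f style
```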

> Let’s put it this way: as a way of provisioning a container, Dockerfile is a joke.

We don't need more complex provisioning tools. We have plenty of provisioning tools. Ultimately, Dockerfiles need to be able to specify what should be copied into the image. Everything else you can do with your standard/preferred build tools. There's no reason for Dockerfiles to try to become yet another fully featured provisioning tool.

> Forget your classic monitoring (unless you want to pull your hair with network bridges). Everything you’ll be able to monitor within the container are ports. That because you run the old school nrpe inside your host, so you won’t be able to check you actually have 8 workers running inside your container.

This is just flat out wrong. Anything running on the host can see the processes running in the container. With the right cgroup manipulation (via nsenter etc.) it can also see the mounted volumes or network space of a container, and so you can still monitor whatever you like.
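
A rough illustration from the host side (the container name and worker process are placeholders):

```shell
# Container processes are ordinary host processes:
docker top mycontainer          # Docker's own view of them
ps aux | grep unicorn           # plain ps on the host sees them too

# For checks inside the container's namespaces, grab its init PID and
# use nsenter (from util-linux):
PID=$(docker inspect --format '{{.State.Pid}}' mycontainer)
nsenter --target "$PID" --net --mount ss -tlnp
```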

> Making your application Docker compliant requires you to rethink the way it works.

Making your application take advantage of Docker, rather than treating Docker containers as sort-of VMs with less isolation, requires you to rethink the way it works. It's not something you need to do in one go - you can "break apart" a larger app environment piece by piece.

> The the tag nightmare begins. If I update my application and add new deps, I’ll have to update container #2. Unfortunately, how will I know I have to do that?

Uh. How does he know he has to update the machine images he deploys his applications to today? Personally I use make - tracking build dependencies is what it is for.
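
As a hypothetical sketch of doing that with make, with stamp files standing in for the images:

```make
# Rebuild the deps image only when the dependency manifests change, and
# the app image when the deps image or the sources change (GNU make).
deps.stamp: Gemfile Gemfile.lock
	docker build -t myapp-deps -f Dockerfile.deps .
	touch deps.stamp

app.stamp: deps.stamp $(shell find src -type f)
	docker build -t myapp .
	touch app.stamp
```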

[+] kitsune_|11 years ago|reply
The main point he makes is valid however,

> Porting your application to Docker increases complexity. Really.

I think the main problem of Docker is that it's sold as an 'easy solution' by many bloggers who only deal with it superficially and then move on to the next big thing. There are a lot of gotchas with docker containers and the creation of clean docker images that are not immediately clear when you start out. A lot of your standard Linux know-how is not applicable.

edit: Also, there are obvious security issues that are not immediately clear to most beginners, most certainly not from the tutorials.

One of my favorites: If you provision your database container with environment variables to create a dba user, and then link your db container to your app container, voilà, your app container will now most certainly have the dba login and password inside its environment variables: https://github.com/docker/docker/issues/5169
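
Roughly (image and variable names are illustrative; this is the `--link` behaviour from the linked issue):

```shell
docker run -d --name db -e POSTGRES_PASSWORD=s3cret postgres
docker run --rm --link db:db busybox env | grep DB_ENV
# the linked container's variables show up as DB_ENV_*, password included
```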

[+] bryanlarsen|11 years ago|reply
You obviously know a lot more about Docker than I do, but I thought I'd add a couple of comments to your great list.

For size, it appears the OP started on his static compilation quest basing his image on Ubuntu. Wheezy is the standard base image in the Docker world for a reason -- it's significantly smaller. More and more images are busybox-based, but I wouldn't want to try that with Rails + ImageMagick.

As for logging, I'd like to point out logspout: https://github.com/progrium/logspout

[+] thinkersilver|11 years ago|reply
His comments on logging are spot on. It's the only part of the post that I can agree with. You can get process monitoring as long as your tools are cgroup-aware. Network monitoring is not so easy. This has been pointed out on the Docker blog in the past.

Getting your logs out of Docker is a PITA, though. Your best bet is to use syslog and configure each application to send its logs to a syslog server. It's a consistent, widely supported way of shipping your logs around. Relying on stdout logs isn't always enough. Many applications do understand syslog out of the box but do not necessarily send important messages to standard out. Docker's management of stdout logs shouldn't be relied on at this stage.

Logspout does look interesting and I wish I'd known about it a few months ago, but see my above comment on stdout.

[+] incision|11 years ago|reply
>"We don't need more complex provisioning tools. We have plenty of provisioning tools."

Absolutely.

Thankfully, the Docker team seems in agreement with this, based on statements about avoiding making Dockerfiles "too clever" and the response to various proposals.

As you point out, most of the "issues" here are really misconceptions.

I expect it's a tough balance for any new(er) project: maximizing exposure and adoption while avoiding the negative perceptions that come from it being applied in ways that aren't optimal.

[+] darklajid|11 years ago|reply
I would love to peek at some of your Dockerfiles if possible?

Tried building a piece of infrastructure with Docker quite some time ago and left when it just didn't click. I certainly made quite a few mistakes (at that time I tried to stuff all the provisioning into the Dockerfile, for example, which you agree is a bad idea?) and the official Dockerfiles were mixed in clarity and not quite useful as examples to me.

So, I would really love to read about how people provision and manage their Docker instances The Right Way.

[+] sciurus|11 years ago|reply
>And if you want to cut build dependencies from the final container: Do the build in one container, install the build-artefacts to a volume, and use the contents of that volume to build a container without the build dependencies.

This is similar to what Netflix, one of the early big proponents of immutable servers, does. They build a deb package, then they create an AMI with that package installed.

http://techblog.netflix.com/2013/03/ami-creation-with-aminat...

[+] Roritharr|11 years ago|reply
Thank you! I was sceptical of Docker for a while just based on a bad gut feeling, but now that you've shown a few of the rough spots I think I might devote some time to it.
[+] imdsm|11 years ago|reply
What you wrote: Thank you! I was sceptical of Docker for a while just based on a bad gut feeling, but now that you've shown a few of the rough spots I think I might devote some time to it.

What I read: Thank you! I've been looking for an excuse not to like docker for a while and this gives me great ammunition for my argument against it.

[+] cbsmith|11 years ago|reply
Honestly, if that is what you took from it, you should read the rest of the comments here. Much of this article didn't make much sense.
[+] lelf|11 years ago|reply
Alternatively, if you aren't Linux-dependent and don't much mind the fact that not everyone is writing blog posts about it, BSD's jails are a lightweight virtualisation mechanism that has been production-ready for decades. (And ZFS pools and snapshots are yours too along with that.)
[+] twic|11 years ago|reply
Jails are great. But - and apologies if I'm teaching my grandmother to suck eggs here - they're the equivalent of Linux's containers, as implemented in LXC or libcontainer or whatever.

Docker is a layer on top of that - it's what prepares the file contents of the jails and looks after them while they're running. I think Docker can even manage FreeBSD jails, although I'm not certain about that.

It's not the production-readiness of Linux containers that is in question here; they're fine, although nowhere near as mature as jails. The doubt is about Docker, the layer on top. If you wanted to make a comparison to jails, it would be to whatever the equivalent of Docker is in the jail ecosystem. I'm not sure what that is; either there isn't one, or it's an ad-hoc pile of site-local shell scripts.

[+] dscrd|11 years ago|reply
Have you tried CBSD? Does it compete well against Docker? The documentation seems quite lacking, but then again, so is Docker's at the moment.
[+] kimi|11 years ago|reply
Two weeks? We have been running Docker in production for over one year and have thousands of separate instances. No big issues so far. All that is said is true (or at least points to an area of concern); it's just a matter of weighing pros and cons.
[+] mikepurvis|11 years ago|reply
What's the present best practice as far as defining an application as a set of multiple inter-linking containers? I know of at least Shipyard and Panamax, and don't especially care for either of them.

Are there others? Is anyone clearly winning in this space?

As it stands, it seems far easier to just create a monolithic container with a bunch of running processes and a supervisord than to break up the pieces and then have a more complicated deployment.

[+] splawn|11 years ago|reply
We are using fig. Just starting to get our feet wet with Docker, so I don't know if it's considered "best practice" though. It seems to meet our needs so far.

http://www.fig.sh/

[+] charlesmarshall|11 years ago|reply
If you want a Heroku / 12-factor-app-style setup, try out deis.io
[+] IanCal|11 years ago|reply
A lot of parts of this don't really make much sense:

> As I’m looking for a way to build a container without having it host the whole build environment (such as Puppet modules, ssh private keys to the Git repository, etc)

Why would you need to check out something from git within the container? That sounds like an unusual setup. Normally you'd check out your repo, then run a build from within it.

> A Dockerfile to build a container with a basic Ruby stack.

> A Dockerfile from #1 to build the deps: checkout the application on Github, install the packages, run bundle install, then remove the application. I’ll be able to share this container with both applications. Big win!

> A Dockerfile from #2 to download the application from Github, then setup everything. So my dependencies are already installed, and it goes fast every time I don’t need to update them.

Shouldn't this just be

Application with a dockerfile like this

    ADD Gemfile /app/Gemfile
    RUN cd /app && bundle install

    ADD . /app/
Check out repo.

Run docker build.

That way it's all cached, and you don't repeatedly build the dependencies. Has the author not seen docker caching?

[+] drothlis|11 years ago|reply
> If I want to keep my containers separated, I can’t have them communicate with a UNIX socket, unless I create a shared volume. Once again it’s a no go for me.

What do you mean by "shared volume" -- docker's "-v" to bind-mount the socket into the container? How else would you like to expose the socket to the container? Why is it a no go?

[+] vidarh|11 years ago|reply
I don't get that one either. It seems like he objects to it on the basis that it reduces isolation, but so will exposing ports via TCP/IP, so I don't get what he thinks he gains by avoiding volumes in situations like that. Especially since you can easily enough share just the socket file and nothing else.
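
Something like this, sharing a directory that holds nothing but the socket (paths and image names are illustrative):

```shell
# app writes its UNIX socket into the shared dir; web only reads it
docker run -d --name app -v /srv/appsock:/var/run/app myapp
docker run -d --name web -v /srv/appsock:/var/run/app:ro mynginx
```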
[+] Rafert|11 years ago|reply
> The minimal Ubuntu 12.02 image is 627MB. Add your own application layer, and your container will most likely weight more than 1GB.

I've read that using Debian (or if possible Busybox) helps in this area. See http://container-solutions.com/2014/11/6-dockerfile-tips-off... and http://jonathan.bergknoff.com/journal/building-good-docker-i... for tips.

> Every time your application is updated, check the Gemfile md5 and see if it’s different from the latest build

Wouldn't you need to check Gemfile.lock? With optimistic/pessimistic version constraints you could get newer gems without updating the Gemfile.
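
If one did want that check, a hypothetical build wrapper could hash the lockfile instead (`.deps.md5` is an invented cache file):

```shell
new=$(md5sum Gemfile.lock | cut -d' ' -f1)
old=$(cat .deps.md5 2>/dev/null)
if [ "$new" != "$old" ]; then
    docker build -t myapp-deps -f Dockerfile.deps .   # deps changed
    echo "$new" > .deps.md5
fi
docker build -t myapp .                               # app layer, always
```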

[+] calineczka|11 years ago|reply
> Wouldn't you need to check Gemfile.lock?

Exactly. It should be Gemfile.lock that is checked, not Gemfile

[+] ptype|11 years ago|reply
This has been submitted previously. Seems like a ? has been added to the end of the URL, which prevents this being flagged as a duplicate. See https://news.ycombinator.com/item?id=8408291
[+] Xylakant|11 years ago|reply
Adding a query parameter is the accepted way if you want to resubmit a link. It also seems to be generally accepted to resubmit if the original submission did not bring a fruitful discussion. Articles sometimes drop off the homepage very fast even though they're interesting for multiple reasons, for example if they're submitted at US night time or if some other, controversial topic dominates the day. I saw the original submission and I like the new submission here.
[+] kimi|11 years ago|reply
Two weeks? We have been running Docker in production for over one year and have thousands of separate instances. No big issues so far. All that is said is true; it's just a matter of weighing pros and cons.
[+] ubersol|11 years ago|reply
I would be really curious to see how you are actually managing these thousands of separate instances. Do you have some kind of a management GUI to do this? Are you managing them through the command line? If so, are you running into issues where managing these takes up most of your time and adds more complexity to your environment? I honestly want to see a good example of how you manage your Docker containers when you have thousands of containers instead of, say, 10.
[+] 72deluxe|11 years ago|reply
I was surprised to see complaints about building libsqlite3 and ImageMagick, presenting them as tricky. It isn't, is it? This is a Linux guy, right?
[+] twic|11 years ago|reply
The complaint was specifically about building a fully static version, I think. I have no idea how to do that, but I imagine it's not as simple as ./configure && make.