
Docker in Production: A retort

133 points | crymer11 | 9 years ago | patrobinson.github.io

120 comments

[+] carapace|9 years ago|reply
I've never used Docker, or containers, but I read about things like "Breaking changes and regressions ... a well documented problem with Docker" and "Can’t clean old images ... a well known issue" and it just seems to me like a crazy thing to try to use and depend on this thing/company. Bluntly put, they seem like children.

So nevermind a retort, what I would like to see is a sane, sensible "business value" cost/benefit, pros v. cons breakdown of just what the heck you're actually gaining (and losing) using Docker vs. some other architecture/methodology. Because absent that it's all just hype and kool-aid drinking in my opinion.

What would help with the above is if people would document what they are doing with Docker that works, because either they are hurting but not realizing it, or the author of the article is just "doing it wrong" and whining about it in public. What is really going on with Docker, et al.?!

[+] emeraldd|9 years ago|reply
I can tell you one place I've found them very valuable: development systems. The longer I've been in the field, the more I've learned that polluting my dev box with globally installed tools makes life painful down the road. Docker provides a nice way (especially now that they have docker exec) to spin up environments, or sets of environments, with tools/stacks and not touch the host. It's lighter than a bunch of vm instances and easier to orchestrate on a small scale than most tools I've found. Docker for Mac makes that even nicer!
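To make that concrete, a throwaway dev service might look something like this (the image tag and container name are just illustrative):

```shell
# Run a disposable Postgres for development; nothing is installed on the host.
docker run -d --name dev-postgres -p 5432:5432 postgres:9.5

# Poke around inside the running container with docker exec...
docker exec -it dev-postgres psql -U postgres

# ...and throw the whole thing away when done.
docker rm -f dev-postgres
```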
[+] jsz0|9 years ago|reply
> seems to me like a crazy thing to try to use and depend on this thing/company

Always beware of perception bias or whatever the right term is for this. When docker was new it got tons of hype and people were talking about how great it was. Now it's no longer the hot new thing so it's just another tool many people are using quietly and successfully. It no longer warrants much positive discussion. The people using docker in production successfully don't feel compelled to write articles about it working for them.

I'd say the pros/cons of docker (or containers in general) are similar to the pros/cons of buying hand made vs. mass produced goods. The hand made goods will probably be higher quality but they take more time and skill to produce so they're more expensive. The mass produced goods will probably be lower quality but will work fine and at least offer a consistent, reliable level of quality. They take less time and skill to produce so they'll also be less expensive. In the context of using Docker in production this translates to it being very good for solving 'off the shelf' types of problems quickly and reliably. It's essentially the software/systems version of an assembly line. If you don't need an assembly line then docker may be worthless or even a total waste of time. If you do need an assembly line it's the best option currently.

[+] ben_jones|9 years ago|reply
As someone considering using Docker for production infrastructure I'd love to hear from a docker expert what their thoughts are on the following observations:

* Docker encourages fully disposable infrastructure

* Docker containers can be more secure than traditional environments

* Docker provides idempotent environments, agnostic of hardware concerns, etc.

* Container management software (Kubernetes, etc.) makes Docker much more powerful/useful and is only going to get better

Specifically, I'm curious if my observations are correct, and if they are why they are correct and why are they SO much better (vs current docker alternatives including non-container based approaches).

[+] empath75|9 years ago|reply
I think old school sysadmins care way more about system stability, while new school devops guys are focused on cost and the speed of the development cycle. If you architect your systems properly, it really doesn't matter that Docker is unstable or that your services crash, as long as you have effective monitoring and self-healing systems.
[+] shykes|9 years ago|reply
Docker founder here.

I keep reading articles stating that "the Docker API changes with every release", but the assertion is never backed by any specific examples. Has anyone here encountered an actual breaking change? If so, I would appreciate you sharing the specifics so we can fix it.

Docker is by no means perfect:

- I remember that in 1.10 the switch to content-addressed registries meant that older clients could not pull by digest (but all other commands, and even non-pinned pull, still worked). This was not an accidental breaking change: it was the result of a difficult tradeoff. In the end we decided that the benefits of the new content-addressed model outweighed the inconvenience. To guide our decision we used data from Docker Hub to assess how many clients would be affected. I forget the exact number but it was a very small minority.

- And in 1.12 we got bitten by a change in how Go 1.6 processes HTTP headers (it became more strict and thus rejected headers from older clients). That was quite simply a screwup on our part.

So we've had our share of screw-ups, no question. But lately I've been reading the "breaks at every release" meme more and more. Based on the evidence I have, it seems incredibly disconnected from reality.

What am I missing?

[+] web007|9 years ago|reply
You're missing the fact that docker clients complain about the API version if it differs from the server's - regardless of actual compatibility problems.

http://stackoverflow.com/questions/37617400/ sums it up. There's a magic (afaik undocumented) env var DOCKER_API_VERSION you can set for compatibility, but nobody can find it.

If you can't mix even minor versions then yes, it's a problem. I can't run X in prod and anything other than X in dev if I'm working with the same toolset.
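For what it's worth, the workaround that thread points at looks roughly like this (the version number here is illustrative):

```shell
# Pin the client to an older API version so it can talk to an older daemon.
export DOCKER_API_VERSION=1.23
docker version
```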

[+] teabee89|9 years ago|reply
Maintainer here,

After verifying, I have to correct your second point: 1.12 DOES NOT BREAK older clients.

To add more information, the issue was present for Docker clients at version 1.9 and older talking to the then-current Docker master. Thankfully it was seen before 1.11 was released, and thus we decided to stay on Go 1.5 for Docker 1.11 for this reason.

For 1.12, we added a workaround for this issue, allowing us to upgrade to Go 1.6.

UPDATE: Docker 1.10 was built with Go 1.5 so this issue was not present with a Docker 1.9 client talking to a 1.10 daemon either.

[+] OhSoHumble|9 years ago|reply
Oh, hai there. I haven't gone 'whole hog' on Docker yet, and consequently haven't tried to push for rolling it out in production for my team.

I don't know how incorrect I am, but it seems like the Docker project is more interested in features rather than stability. Except, like you said, I don't have any specific examples to back that up. Maybe it's just community hive mind thought bleeding onto my decision making abilities?

The most concrete failing I've experienced was the deprecation of the boot2docker project in favor of docker machine. When I started using docker machine, stuff just... broke... and it left a bad taste in my mouth. In fact, using docker on either a mac or windows is just awful for me. My anecdotal experience is that it breaks in small ways that can really disrupt my productivity. I'm in the process of picking up docker again and it led me to install Arch on my work laptop just so I can have native containers.

Right now my use case is using containers to test Chef cookbooks and it works... alright. Not as well as virtual machines, but alright. How docker handles init bothers me a little bit but it's nothing the maintainers of the testing suites for Chef couldn't handle through yaml configuration options.

Oh, I guess while you're here... one thing I've been wrestling over is the usage of dockerfiles. I have no problem with containers but to me, it looks like using a configuration management system to bake container images seems to grant so much more flexibility and testing capabilities that I can't really see dockerfiles as anything other than an intermediate step. Is this a valid viewpoint?

[+] gerdesj|9 years ago|reply
Thanks for stopping by. I too would like to see something more than a "my toys broke - mwaaaah".

I personally find Docker a phenomenal resource for development. For production however, you analyze your use case, the risks, the failure modes etc etc and then you decide whether Docker is a part of that.

I'm by no means a massive Docker user but compared to many of the sub systems running on my laptop it has been very stable for me although I do run Gentoo. Actually my entire laptop is surprisingly stable TBH. I absolutely love the way I run a few short commands and a fully tooled up web app with all the supporting infrastructure simply appears.

Funnily enough, only yesterday I had a look at this: https://nextcloud.com/collaboraonline/ which I found in an article somewhere (LWN?). Yep, love it (NC), so the next step will be to decide how to use it for my home production system (yes, I have dev/production at home). For that I will use pretty much the same criteria that I will use for my office and finally my customer systems. In this case Docker for production probably makes sense but I need to analyze it a lot more. A CoreOS VM will probably be spun up very soon in my attic for stage 2 in home production.

I really don't like to see rants along the lines of the other article, denigrating a product that a lot of people have worked hard on, that shows a lack of discipline and clarity of thought. I for one would like to thank you for your work and making it available for all. It has genuinely saved me time and effort and made my IT life a little bit better.

To answer your final question: a thicker skin, mate 8)

Cheers Jon

[+] DrNemski|9 years ago|reply
Hi There,

I created an account here so I could let you know I've clarified some points in my post to address your concerns.

[+] cyphar|9 years ago|reply
> Has anyone here encountered an actual breaking change?

The fact that if your local client is newer than the daemon, the client will fail to do anything is pretty annoying. You can hack around it, but that's not a solution to the problem.

Also, while you shake off the 1.10 change as though it wasn't a huge deal, the migration tool that was written was not fit for purpose. First of all, it sometimes caused corruption after the migration (fixed after release but still an issue, and you couldn't practically un-bork the corruption). There were other issues with the tool, but the biggest issue was that there wasn't any sane transition period -- which made packaging very difficult. Because Docker couldn't handle restarting the daemon and keeping the containers around (and after restarting a post-1.10 daemon you'd have to wait for the full migration to finish before you could even interact with the daemon -- another issue with the transition), you'd have to publish a fake version bump of the Docker package to even alert users to the issue!

A more sane way to handle it would've been how databases handle the issue. You lazily do the conversion (or you have your software be able to handle both formats so you can run your migration while still having your containers work). And yes, I'm aware that it'd be difficult to implement but the transition path implemented was really bad.

I spent several weeks of my life (that I'll never get back) trying to make the migration path work. It didn't end up working because it wasn't designed to work nicely from the outset. So I ended up just telling users of our Docker packages about the issue and packaging the migration tool separately.

Another issue that I've had recently is that Docker is making it increasingly difficult to use `runc` by another name. In openSUSE we package a system runC separately to Docker (and it lives in /usr/sbin/runc), but because of the fact that in 1.12 --runtime was intentionally broken (and containerd's defaults are no longer obeyed by Docker) now you can't effectively use --exec-opt native.cgroupdriver=systemd. Great. So now we're going to have to package docker-runc and docker-containerd separately.
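For context, the kind of daemon invocation we would like to work with a system runc is roughly this (flag names from the 1.12-era dockerd; the runtime name and path are openSUSE's, as described above -- and per the above, the combination no longer behaves as expected):

```shell
# Register the distro's runc as a named runtime, make it the default,
# and keep the systemd cgroup driver.
dockerd --add-runtime suse-runc=/usr/sbin/runc \
        --default-runtime suse-runc \
        --exec-opt native.cgroupdriver=systemd
```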

Don't get me started on the containerd split. I was not very happy the day that PR was merged, because it caused a bunch of headaches for us. It's also a bit amusing to note that containerd was tagged as "pre-alpha" until Docker bundled it (then it was considered stable).

That all being said, Docker is a great project -- it's just that you have to admit that issues with Docker are not "incredibly disconnected from reality".

[+] Johnny555|9 years ago|reply
This seems less of a "retort" and more of a validation that most of the issues brought up in the original article are valid complaints.
[+] nickthemagicman|9 years ago|reply
I love how the major issue that both this article and the original article warn about is: don't use docker on 'CORE APPS'....

That says all you need to know about the trustworthiness of Docker.

EVEN DOCKER PROPONENTS caution against using it in 'important' apps....

What apps are people investing time in that aren't 'important'?

Is there a coffee machine that is ok to use for a docker app somewhere?

[+] user5994461|9 years ago|reply
> Is there a coffee machine that is ok to use for a docker app somewhere?

Most coffee machines are docker ready.

To guarantee the best experience, you will need to set up a pair of coffee machines, plus an orchestration system that will be responsible for swapping them automatically when one runs out of coffee.

Note: There are only prototypes of orchestration systems. Nothing for sale in the corner shop yet.

---

More seriously...

Not important: Most internal, development, and test systems

Somewhat important: Web applications, various support micro services. (They all are stateless, with multiple instances, and reactive failover by their respective load balancers).

Critical: Most databases (especially the ones without multi-master mode and automatic failover), trading applications, payment systems, accounting systems, databases with money $$$

[+] sheeshkebab|9 years ago|reply
AWS was also considered not a fit for running "core" apps and databases... look where we are now.

Docker is certainly having some teething issues - and things like kubernetes are certainly WIP. But I wouldn't be as skeptical - plenty of companies use it in production already, for "core" apps.

[+] flurdy|9 years ago|reply
I interpret that as: don't use X (in this case Docker) with your most essential services. Initially. Cut your teeth, make some mistakes, establish routines, etc. with your internal apps, your low-traffic services, etc. Be patient, then once comfortable and confident you start to roll it out to your core, heavily integrated services in production.

Having just come off a year at a large (very large) government department that has 100s of core services running in Docker in production, I think the tech is ready, but not sure if every company is ready.

[+] CSDude|9 years ago|reply
> Again, well accepted principle that “thou shalt not run a database inside a container”. Don’t do it, end of story.

Sorry, but this is really bad advice. We have run and continue to run various databases inside Docker, including MySQL, PostgreSQL, Redis, Cassandra, Elasticsearch, RethinkDB, even HDFS, with proper user rights and configuration. We can maintain the state just fine. If your only problem is to move the data, all you have to do is stop, export, tar it, and move it to another server, just as you would on a normal server. Docker is not a magic bullet to solve that kind of issue. Yes, Docker might have other problems, but just because you could not run something with state inside Docker does not mean "thou shalt not run"; there are various ways to manage state. The host or IO can crash regardless of Docker.
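The stop/export/move dance is nothing exotic; a sketch (container names and paths here are made up):

```shell
# Stop the database container so the on-disk state is quiescent.
docker stop mydb

# Tar up the host directory backing its data volume and ship it over.
tar czf mydb-data.tar.gz -C /srv/mydb-data .
scp mydb-data.tar.gz other-host:/srv/

# On the other host: unpack, then start a container pointed at the data.
#   tar xzf /srv/mydb-data.tar.gz -C /srv/mydb-data
#   docker run -d --name mydb -v /srv/mydb-data:/var/lib/mysql mysql:5.7
```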

[+] otterley|9 years ago|reply
What problem do containers solve for you for this particular part of your infrastructure? Native storage software packages are available for mainstream OSes that handle dependencies via the native package manager. And since the storage they manage is usually directly attached, nodes that run this software are infrequently migrated. And this software is infrequently upgraded under the maxim "if it ain't broke, don't fix it." It's there to store data on behalf of the applications you write; it is not a thing to upgrade or migrate for its own sake unless there is a bug to quash or a new feature that your application will depend upon; and even then, migrations must be carefully planned to preserve availability for users.

Docker and the like seem like a solution in search of a problem for this particular part of a typical service infrastructure.

Or, to put it more bluntly, just because you've gotten away with it (thus far) doesn't make it a good idea.

[+] lobster_johnson|9 years ago|reply
Re ECR (the EC2 Container Registry), it has one downside that the author doesn't mention, which also applies to Google's own registry.

A Docker registry has its own authentication system. So does AWS (and GCloud). So what you end up with is one wrapping the other: To access the ECR, you have to run an AWS command to get a token to put into "docker login". Google has "gcloud docker login" for the same purposes. Both produce temporary credentials that time out, so can't be used for long-running things.

This means that any tool designed to work with a Docker registry needs to support this particular workflow. For example, this affects Drone [1].

It also adds complexity. GCloud is particularly heavy on the authentication complexity side already (compared to AWS's comparatively simple keypair approach), and with SSH, GCR and Kubernetes on top, it starts to stack up in ways that can make users' heads spin.

Straight Docker Hub is refreshingly straightforward by comparison.

[1] http://readme.drone.io (not to be confused with drone.io)
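For the curious, the ECR login dance looks roughly like this (the exact aws subcommand has varied across CLI versions, and the account/region here are placeholders):

```shell
# Ask AWS for a temporary registry token and feed it to docker login.
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin \
      123456789012.dkr.ecr.us-east-1.amazonaws.com

# The credentials expire (on the order of hours), so any long-running
# tool has to re-run this step periodically.
```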

[+] jwatte|9 years ago|reply
Previous article: "This is nowhere near ready for those who just want to get the job done."

This article: "It'll be better in the future, you'll see!"

The former is verifiable, the latter is a hypothesis.

[+] DrNemski|9 years ago|reply
No, what I was saying was it's ready today for certain use cases. But only if you look at it as a long-term transition and not a short-term project.
[+] girvo|9 years ago|reply
As an example of Docker in production: Expedia are moving lots of their legacy infrastructure into Docker containers. My third-party contracting team that works on projects for Expedia (we're brought in so the rules and bureaucracy don't apply to us, allowing us to rapidly iterate and experiment in ways the core teams can't) have been using Docker end-to-end (local development through to autoscaled production deploys).

While there were teething issues, this article does a good job of pointing out the flaws in the original article, I think. It's been easier to get our team up to speed on Docker and its gotchas than nearly any other configuration management, server management, et al. systems that we tried!

[+] user5994461|9 years ago|reply
Short version: Nope :p

Long version: I met the DevOps guy who [I believe] is responsible for pushing Docker at Expedia and we've had long conversations about it.

They were lucky to have had a particular environment and a specific version that worked, and got it pinned down and frozen very early.

I suppose you are on the dev side and not aware of all that. (Hell, maybe, you're not even in the same subsidiary of Expedia). I'm glad it all worked out for you as a dev, my devs are also happy with Docker. (We're probably used as an example of Docker success story at times).

In the end, there is no free lunch. There is dirty work done and more to be done. Some of which is invisible.

[+] conradk|9 years ago|reply
Can anyone comment on how rkt compares to Docker regarding the issues from this article ? And how does rkt compare to Docker in production in your experience ?

I've been using Docker in production for a single-server website and have had very few issues. I do like how easy it is to reproduce a working environment with a "docker build" though.

That being said, I think that just using Ansible on a server is probably an easier and more reliable solution. Ansible is battle-tested and allows you to have reproducible environments too.

[+] DigitalJack|9 years ago|reply
Is ansible still python 2.7 only?
[+] wickedlogic|9 years ago|reply
Related question: what happens when a docker image gets pop'd.... how do you keep it around for investigation? Does it get imaged for later forensics? Every time I have asked people IRL doing docker, they seem to focus on updating/patching... on how easy that is, and moving on... but that is not always an option for every client. Do you just image all docker containers before they get terminated/migrated?
[+] justincormack|9 years ago|reply
All the containers are available after termination, yes, so you can investigate.
[+] smegel|9 years ago|reply
> So the point is valid, but there are some big names invested in solving it, so I’m optimistic we’ll see some stability in the future

And it will still be valid if someone forks Docker. In fact, that would validate the criticism.

[+] pfarnsworth|9 years ago|reply
Are these breaking changes problems caused by Docker itself? I was contacted by Docker and was considering applying, but it sounds like their engineering management doesn't know what they're doing. Is this depiction accurate or is it overblown?
[+] shykes|9 years ago|reply
I would recommend investigating the matter yourself, and making your own opinion. What specifically has been broken in past Docker releases? How have they handled it? How would you have handled it? If you do decide to interview with Docker, make sure to bring up your findings, especially areas where you think they screwed up. This has two advantages: you can see how they react to constructive criticism, and they can observe that you are capable of making your own opinion and providing constructive criticism.

If you really want to impress your interviewers, back up your criticism with a pull requests fixing the issue. That will automatically put you on top of the pile of resumes.

[+] pmarreck|9 years ago|reply
Anyone know why Erlang doesn't run well on containerized Docker?
[+] user5994461|9 years ago|reply
Hello. It was a long time ago and I didn't get all the details.

I think it either came down to a broken version of docker at the time that couldn't run properly.

Or it was because of instability in Docker. All our Erlang apps are meant to take thousands of requests per second, and Erlang has particular ideas about how the system/network should behave. That's very hard on Docker, and any instability or incompatibility would turn into outstanding failures.

Either way, we know for sure now that there were many major critical issues with Docker at the time of the Erlang trial. Not adopting it was the right decision, and it still is.

We only use Erlang for critical applications. We really can't afford to push the limits after a few red flags.

[+] jwatte|9 years ago|reply
I have used Erlang and containers, but not in combination. I know that Erlang (really, BEAM) has some ideas about what the network should behave like, and highly virtualized/"software defined" networking may confuse it.
[+] justincormack|9 years ago|reply
What issues are you having? Containers are not really different from non containerised environments. Have you filed an issue?
[+] lobster_johnson|9 years ago|reply
It runs fine. However, the challenges are the same as in any other environment (e.g. AWS): remember to set an Erlang cookie, and make sure the host name is sane/set and resolvable, otherwise you will run into issues with the portmapper (epmd).
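Concretely, something like the following tends to avoid the epmd surprises (the image name, and the idea that the entrypoint reads an ERLANG_COOKIE variable, are assumptions for illustration, not a standard):

```shell
# Give the container a stable, resolvable hostname and a fixed cookie
# so Erlang distribution can find its peers.
docker run -d --hostname node1.example.com \
  -e ERLANG_COOKIE=s3cret \
  my-erlang-app
```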
[+] bitwalker|9 years ago|reply
I run several erlang/elixir applications in containers, it works just fine. I'm not sure why someone claimed it doesn't, but I suspect they didn't try very hard if they did have trouble.
[+] empath75|9 years ago|reply
Why not use an erlang unikernal instead?
[+] ledil|9 years ago|reply
If I am using volume mounts to export my data, can I bypass the aufs/overlay implementation/logic? Do I need to pay attention only if I don't mount the volumes? thx
[+] justincormack|9 years ago|reply
Volume mounts do not use the aufs/overlay drivers, no. Those are used for builds, and for constructing the root filesystem of running containers so files are shared.
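In other words (paths and image name invented for illustration):

```shell
# Writes under /var/lib/app go straight to the host's /srv/appdata,
# bypassing the aufs/overlay layers entirely.
docker run -d -v /srv/appdata:/var/lib/app my-app-image
```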
[+] corv|9 years ago|reply
Docker seems very limited when it's unsuitable to run databases.

I've never seen this limitation with other container solutions. What is it about Docker that makes it problematic?

[+] lobster_johnson|9 years ago|reply
Nothing. It's bad advice. Lots of companies run databases in containers.
[+] SoreGums|9 years ago|reply
Most likely lack of planning, and applying an "it should just work" process to the scenario at hand...

The thing I've learnt with Docker is that all the other prod issues still exist. Docker at its core only solves the executable part, or distribution of the process to run. You still need to figure out networking, storage, monitoring, backups, discovery, etc...

What starts out as a single host can quickly quadruple once all the other considerations are taken into account and wanting a scalable, reliable and available system.

[+] cuillevel3|9 years ago|reply
Good retort. The original article seemed clueless, the part about aufs was just wrong, and the complaints about the apt repo exaggerated. Running docker on ancient Debian is kind of brave, though. And software is finished after five years? Maybe in the financial industry. Currently development has such a pace, I'd say after five years it's abandoned and replaced.
[+] yawz|9 years ago|reply
"The internet has been a wash with a well written article about ..."

Typo: a wash => awash

I know! The content is more important than the quality of the writing, but it's a little surprising to see such a mistake jumping at the reader at the start of an article. We should go back to the first days of the Internet where "updates" were possible. :) I would have loved to suggest an update quickly.

[+] DrNemski|9 years ago|reply
My googling failed me on the correct grammatical term.