top | item 34225669

Ask HN: A Better Docker Compose?

74 points| Randomdevops | 3 years ago

In docker compose you have a flat list of services and you manually weave them together with configuration. Then in an effort to secure things you add on frontend/backend networks to isolate containers from each other.

   services:
     proxy:
       build: ./proxy
       networks:
         - frontend   
     app:
       build: ./app
       networks:
         - frontend
         - backend   
     db:
       image: mysql
       networks:
         - backend
      
You add config to share credentials between services.

   services:
     app:
       build: ./app
       environment:
         DB_PASSWORD_FILE: /run/secrets/db_root_password
       secrets:
         - db_root_password
     db:
       image: mysql
       environment:
         MYSQL_ROOT_PASSWORD_FILE: /run/secrets/db_root_password
       secrets:
         - db_root_password
         
   secrets:
     db_root_password:
       file: ./secrets/db_root_password.txt
    
    
Is there a way to abstract away these extra configuration steps and keep it simple yet secure by default?

If you could express db as a dependency/resource of app, could the tool infer that it should go in a separate network and have the credentials linked automatically? 'As a developer' I'm not really interested in the network specifics or which credentials; I just want the services to talk securely, minimize attack vectors, and keep any configuration to a minimum. With tens of apps, their databases and the cross-cutting connections between them, how do you keep the configuration to a minimum?
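For illustration, a purely hypothetical sketch (this is not valid compose syntax) of what such a dependency-oriented schema might look like, with networks and credentials inferred from a `requires` list:

```yaml
# Hypothetical schema, not real docker-compose syntax:
services:
  proxy:
    build: ./proxy
    requires: [app]     # tool would infer a frontend network
  app:
    build: ./app
    requires: [db]      # tool would create an isolated backend network
                        # and inject generated credentials into both ends
  db:
    image: mysql        # tool would generate MYSQL_ROOT_PASSWORD itself
```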

Googling around, I found:

humanitec: https://docs.humanitec.com/integrations/resource-types/mariadb They express something as 'resources': a dependency type that can be provisioned automatically, with inputs and outputs that can then be injected into the application's env vars: mysql://${externals.my-db.username}:${externals.my-db.password}@${externals.my-db.host}:${externals.my-db.port}/${externals.my-db.name}

Of course you're limited to a fixed set of drivers, and how would you express an app1-to-app2 dependency?

juju: https://juju.is/

Each app is packaged in a charm, which seems to be a YAML file declaring inputs, dependencies and other metadata, plus optional Python code that can respond to certain lifecycle hooks.

https://discourse.charmhub.io/t/implementing-relations/1051

   name: my-node-app
   ...
   requires:
     database:
       interface: mongodb
   provides:
     website:
       interface: http
    
Things can seemingly be autowired based on what interfaces they provide and require? So you just make a list of apps until everything resolves?

Does anyone have experience with these tools or others like it?

How do you stop others and yourself from drowning in credentials, certificates, env vars, JVM params and k8s YAML (times every environment)? How do you manage, not the configuration itself, but an inventory of what configuration is needed to run your environment (or a subset for CI)?

73 comments


achempion|3 years ago

I find your example simple and easy to understand and I don't see any issues with defining everything explicitly.

I've been using Docker Swarm/Compose extensively for multiple years. The configuration can be a bit verbose sometimes, but it's a very simple system to maintain and write config for.

I think splitting relatively small configs into smaller bits and using additional abstraction can actually increase the complexity of the system compared to having everything in a single file. It's basically write-and-forget, not something you modify heavily once the system is mature.

Randomdevops|3 years ago

Well they are examples :)

But imagine the complete deployment looks like 2 monoliths, a dozen supporting webservices, 3 databases, redis, an elasticsearch cluster, ActiveMQ and a mailserver.

Then for a development track you only want to run monolith A (cost/resources/startup time) and the mailserver being MailHog so you don't accidentally send something to real addresses.

But you can't just split the 2 monoliths into separate compose files, as they share services/DBs. If you did that, you would have to wire them together manually.

For local development you might only want to run webservice B (which has a db and redis as dependencies)

And indeed, if you try to manage several files that need to be combined in a specific way, things get messy...

So things seem to boil down to starting every service separately and relying on service discovery, but then the lifecycles aren't tied together anymore. If I want to tear them all down, how would I know which ones I spun up in this imaginary context?

Say I want to spin up an on demand integration environment for an application, what else do I start too?

mooreed|3 years ago

At the risk of sounding clichè/unhelpful.

> “you can solve every problem by adding or removing an abstraction layer”

Have you considered generating your YAML/JSON config with something that composes?

If you are open to it you might be interested in dhall [1] as it’s a config language with variables, functions and imports.

I have used it for pet projects and I could see how it could offer some tidy encapsulation patterns for larger, more complicated production applications.

[1]: https://dhall-lang.org/

satvikpendem|3 years ago

Cliché is spelled with é, rather than è, which makes the word sound more like cleesh, not cleeshay.

oneplane|3 years ago

Docker compose is a dead end AFAIK, it's not deprecated or anything, but the community power behind it has essentially moved to other things (Kubernetes, but also wasmer, nomad, skaffold).

While Docker Desktop / Podman / Rancher Desktop combined with stuff like Skaffold aren't exactly a drop-in replacement for docker-compose, they do a much better job at bringing up and tearing down entire compositions while re-using existing packaging and access controls.

If you are running docker-compose for non-development things, it might be a different story; it might be suitable for non-GitOps things, but as posted elsewhere, at that point you're better off using something like systemd.

When composing really small setups I either do this with a shell script (think 10 lines including trapping exits) or a systemd unit. Whenever it needs to be bigger I nearly always end up with an actual scheduler (K8S, Nomad) and GitOps because you can't really deliver something maintainable, available and durable anymore without it (well... I suppose if you have only 1 project to deliver, forever, you could manually manage it).
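The small-setup approach described above (a shell script that traps exits to tear everything down) might look roughly like this sketch, with `sleep` standing in for actual `docker run` commands:

```shell
#!/bin/sh
# Sketch of the "small shell script with exit trap" pattern; `sleep`
# processes stand in for containers so the sketch runs anywhere.
set -eu

PIDS=""
cleanup() {
  # Tear down everything we started, then reap the children.
  for pid in $PIDS; do kill "$pid" 2>/dev/null || true; done
  wait 2>/dev/null || true
}
trap cleanup EXIT INT TERM

sleep 60 & PIDS="$! $PIDS"   # stand-in for: docker run ... db
sleep 60 & PIDS="$! $PIDS"   # stand-in for: docker run ... app

echo "services up: $PIDS"
```

When the script exits for any reason (including Ctrl-C), the trap tears down everything it started, which keeps the lifecycle of the composition tied together.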

It does get a whole lot easier when you have a common foundation you can re-use. Spinning up an entire stack with security, monitoring, alerting, automatic rollouts/rollbacks for even the smallest project is just single-digit minutes work now.

Pulling in some other factors: how sharable/collaboratable is something these days if it is not built on similar enough technologies and modules? A solo yolo project might not care much about this, but when was the last time someone asked for software that is risky and not durable?

theteapot|3 years ago

> Docker compose is a dead end AFAIK ..

What? I'm not involved and don't follow closely, but I'm pretty sure it's about as dead as Docker itself, i.e. not dead. There were commits 8 hours ago -- https://github.com/docker/compose/. Not sure who did that if not "the community".

smcleod|3 years ago

I really don't think that's the case; they're for different purposes. docker-compose is simple and useful for running things on a single machine, e.g. in a VM, on a home server, on your local machine, etc. Kubernetes is for scheduling, deploying and scaling things across many servers, cloud platforms, etc. Both work with containers, but in different situations for different purposes.

solarkraft|3 years ago

It's polarizing to call it dead; I'd say it's very much not, but it's fairly clear that the Docker people would rather have you use their follow-up project Swarm (which I think is supposed to be something like Kubernetes lite?). A major example of this is that the ability to define resource limits was removed from docker-compose v3 files unless you run in swarm mode (which is mostly compatible).
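Concretely, v3 files moved resource limits under the `deploy:` key, which swarm mode honors natively but classic `docker-compose up` ignored unless run with `--compatibility` (a minimal sketch; image name invented):

```yaml
services:
  app:
    image: my-app
    deploy:
      resources:
        limits:
          cpus: "0.50"
          memory: 256M
```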

KronisLV|3 years ago

> Docker compose is a dead end AFAIK, it's not deprecated or anything, but the community power behind it has essentially moved to other things (Kubernetes, but also wasmer, nomad, skaffold).

> While Docker Desktop / Podman / Rancher Desktop combined with stuff like Skaffold aren't exactly a drop-in replacement for docker-compose, it does do a much better job at bringing up and tearing down entire compositions while re-using existing packaging and access controls.

I don't know about this: in my experience the mentioned alternatives are generally far more complex and cumbersome to set up initially for smaller projects.

I've seen successful projects use Docker Compose locally for all of the dependencies that need to run in containers (sometimes the apps run locally for easier debugging; other times they also run in containers so there's no need to set up a runtime). I've even seen Docker Compose used in production for projects that don't need to scale that much - just one server in which these containers run, not that dissimilar from what you might get with systemd services on a single node.

And when the need to scale out finally arises, then something like Docker Swarm (which uses the Compose specification) has been enough in the majority of cases, when you don't need anything too fancy (e.g. network partitioning, a la Istio/Kiali). It's extremely simple to set up and works really well, especially with something like Portainer. In addition, its integration with Ansible is also pretty great - though I'd advise Ansible in most cases in general, when you manage the underlying servers.

And I know that many claim Swarm is dead - though in my eyes it's also basically feature complete and still gets new releases and bug fixes alongside Docker (which others claim is also dead); a bit of caution and having migration paths (e.g. Kompose) doesn't hurt, of course. When you do need (or want) to run something more "current", Nomad becomes sufficient on the servers, but I've never had the need to run it locally - since your environment descriptions will still be reasonably simple, you can just use Docker Compose locally and the HCL equivalent on the servers, provided that you have dev/test/staging environments before prod to test everything on, even if that is a drawback.

If you do decide to go with Kubernetes, then in my experience k0s, K3s (developed by Rancher) or even MicroK8s are all excellent starting points, but at that point you'll probably also want Helm charts and will essentially need something like Skaffold locally. When you get to that point, however, I've found that dealing with the complexity of Helm charts and running Kubernetes locally has more overhead than just using Swarm.

> It does get a whole lot easier when you have a common foundation you can re-use. Spinning up an entire stack with security, monitoring, alerting, automatic rollouts/rollbacks for even the smallest project is just single-digit minutes work now.

This is an excellent point, however! Maybe the DevOps cultures I've seen haven't been strong enough, though, since this has never really been the case in my experience and each service had a bit of custom work to be done for these aspects.

> Pulling in some other factors: how sharable/collaboratable is something these days if it is not built on similar enough technologies and modules? A solo yolo project might not care much about this, but when was the last time someone asked for software that is risky and not durable?

Here's a counterpoint - while many out there might be in the sweet spot where they can just pay someone else to give them a managed Kubernetes cluster, what about those who can't? What about those who decide to use Kubernetes, but have to host and upgrade their own clusters? It can easily become less durable and more risky, especially if your DevOps/Ops people aren't familiar with all of the nuances of Kubernetes.

Right tool for the job and all. Sometimes it's a great choice, other times it's risky.

stokedbits|3 years ago

Personally after using docker compose since it came out I’m excited to see the evolution of local Kubernetes development. I’d scrap even dealing with trying to make something as limited as docker compose do what you want. I’d focus on moving towards local Kubernetes development.

This will bring you closer to the deployment stack if you are deploying to Kubernetes. Then also let you leverage tools like kustomize to dry out your configurations.
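As a sketch of the kustomize approach (the paths and `dev-` prefix are invented), a dev overlay reuses a shared base and patches only what differs per environment:

```yaml
# kustomization.yaml for a hypothetical dev overlay
resources:
  - ../../base          # shared manifests for all environments
namePrefix: dev-
patches:
  - path: replica-count.yaml   # only the dev-specific delta lives here
```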

There are some great projects like tilt, devspace, skaffold, etc that help facilitate deving on a local or remote cluster.

As far as configuration management that can be as simple as cascading kustomize configs or helm. Then leveraging something like vault. The point really is, if you start with Kubernetes you have way more flexibility with tooling and options to do whatever the heck you want.

Shameless plug I recently started a series on local Kubernetes development. It covers some of this with tilt and more. If you would like a specific thing covered here I can add an installment to it. https://youtube.com/watch?v=Nj55RDVwrIE&si=EnSIkaIECMiOmarE

ecuaflo|3 years ago

Thank you! Will you go on to teach infrastructure provisioning/IaC, CI/CD, live schema migrations, rollbacks, etc on a cloud provider with this stack? I would find the complete devops pipeline so helpful

ditsuke|3 years ago

This might end up being my gateway to learning and using k8s. Thank you!

na4ma4|3 years ago

You can use environment variables with error messages

https://docs.docker.com/compose/environment-variables/

   - `${VARIABLE:?err}` exits with an error message containing err if `VARIABLE` is unset or empty in the environment.
   - `${VARIABLE?err}` exits with an error message containing err if `VARIABLE` is unset in the environment.
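In a compose file, the first form can be used to fail fast on missing secrets (service and variable names invented):

```yaml
services:
  db:
    image: mysql
    environment:
      MYSQL_ROOT_PASSWORD: ${DB_ROOT_PASSWORD:?DB_ROOT_PASSWORD must be set}
```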

klntsky|3 years ago

Arion is a Nix wrapper tool for docker-compose. Nix is handy for use as a template language. It is actually closer to a general-purpose language than to a template engine, so I guess what you want can be implemented manually, since it only requires config file generation.

Arion docs: https://docs.hercules-ci.com/arion/

Nix language: https://nixos.wiki/wiki/Nix_Expression_Language

pxc|3 years ago

In Arion, dependencies between services are still implicit, as at the end of the day what you're doing is generating a docker-compose.yml file.

There are other tools that model things more like the OP is thinking, though! With Disnix, you have an explicit service model where you declare those dependencies.

See: https://hydra.nixos.org/build/203347995/download/2/manual/ or maybe Sander's blog posts on it

fswd|3 years ago

I use a monorepo (turbo) that has a directory ./app/<docker workspace or app name>. In it I have a package.json that runs docker-compose up. I use regular .env files and keep a .env.sample in the repo. The docker-compose brings up the services and volume-mounts the .env into its local environment (same with /data, for example for postgres or some other package).

I then disable iptables in /etc/docker/daemon.json ("iptables": false) to prevent Docker from poking a hole in my firewall. (This should be the default..) I also enable user namespaces with "userns-remap": "default" so that the Docker image runs isolated. I can disable this in my docker-compose with userns=host. Actual instructions are slightly different, but: https://docs.docker.com/engine/security/userns-remap/
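A minimal /etc/docker/daemon.json with the two settings described above (both are real daemon options):

```json
{
  "iptables": false,
  "userns-remap": "default"
}
```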

I also found I could disable networking altogether on some services and just expose a port. Some services don't need to connect to the net.

Then I run a Cloudflare Argo tunnel to the application. But you could also use an nginx reverse proxy, some other tunnel, or a one-page express.js reverse proxy with pm2. There's a Cloudflare Argo tunnel docker-compose setup which lets you type your tunnel information into the config. I thought that was cool, but I haven't tried it.

ecuaflo|3 years ago

I’ve been spending a week trying to learn how to deploy a collection of containers (my web app, a Postgres DB, and some microservices) to AWS and I am still so lost.

The first solution I happened upon was serverless. Specifically SST, which is built on the AWS CDK, but you must develop against live services and I just can't justify paying to develop.

Then I found Serverless Framework, which is an abstraction over CloudFormation, but offline solutions like localstack get a lot of flak for being buggy, and localstack charges for some services. I also looked into Architect, but the documentation is abysmal.

Then I figured serverful might be the easier way to go. I found that Docker Compose has a built-in integration with AWS ECS where it transforms your YAML into CloudFormation to provision the right services. However, it seems to be missing key parts like custom domain and SSL certificate provisioning, which seems to defeat the IaC ethos.

Then I figured I might go with Terraform, and I found some seemingly good starters like https://github.com/aws-ia/terraform-aws-ecs-blueprints https://github.com/cloudposse/terraform-aws-ecs-web-app https://github.com/turnerlabs/terraform-ecs-fargate but the examples are just lacking. I can't find any examples of multiple containers that can access each others' resources. Reading these templates has at least given me a better idea of the resources I need to provision in AWS, but the networking and configuration still frighten me. Do I need to configure nginx as a reverse proxy myself? How do I orchestrate that container with the others? And apparently services can crash and just not restart? And I need to make sure to configure volumes for data that needs to persist. And setting up the CI/CD seems daunting.

I’ve also heard about docker swarm, kubernetes, pulumi, AWS SAM, etc but it’s a lot to learn. When I go on Discords for web frameworks, mostly everyone including the devs of these frameworks use 2nd tier managed providers like Vercel, Fly, Netlify, Supabase, Cloudflare, etc. But many of those are just not as reliable as core cloud providers, the cost is way higher, and now you’re setting up a local stack that probably wildly differs from how it will work in production between those services. Glad to see I’m not alone in a very reasonable expectation of a simple way to orchestrate multiple containers on AWS, what must be the most common use case web developers have

latchkey|3 years ago

I feel like you've gone down the rabbit hole to the point that you've over complicated everything and now you can't do anything.

Take a step back and simplify everything.

Personally, I just use GCP cloud functions talking to a GCP managed postgres database. I don't worry about all the deploy stuff since their tooling takes care of it for me... heck, integrate it with github actions and it is just a git push to go live.

GCP is also next to free for low tier usage. Last I looked, a tiny postgres instance was one of the cheapest out there.

Then, just put cloudflare in front of GCP and now you've got the ability to use their functions in front of GCP too.

Fully scalable, relatively simple and low cost.

brundolf|3 years ago

> But many of those are just not as reliable as core cloud providers

Aren't they? I've never heard this, do you have a source?

> the cost is way higher

Depends. For small-medium stuff the cost may be "way higher" in that it's dollars instead of pennies per month, but you have to weigh that cost against the cost of your time (how many months of PaaS hosting did the past week cost you in dev hours?). I'm sure the economics become significant at some scale, but I'd ask yourself whether they are at your scale

> and now you’re setting up a local stack that probably wildly differs from how it will work in production between those services

That doesn't have to be true at all; many of these services let you ship Docker containers if you want to (and manage everything else from there), but it's possible to go without even that: https://news.ycombinator.com/item?id=34243045

I'd recommend taking a step back and seeing how much complexity is really necessary for what you're trying to do

klooney|3 years ago

If you want something that's free locally, but has a pretty good port-to-cloud story, you want kubernetes.

Getting live with EKS in AWS is a little scary and painful and not cheap, but if you just work through https://www.eksworkshop.com/ it's pretty plug and chug.

Doing it with Cloudformation or Terraform or something built on top like Pulumi or the CDK is something I wouldn't recommend unless you do this for a living. Just stand up the infra manually- doing a bad job is like 10% of the work required to do a "good" infra as code job, and you don't sound interested in the project.

All your actual app stuff can be nice declarative k8s yaml, and you'll be good to go indefinitely.

samuraixp|3 years ago

I find this really annoying. There's also no concept of a package like Helm charts, which means you can distribute all your images on Docker Hub, but when it comes to compose files and configurations with templating, you're up @$%^ creek.

I was hoping something like https://dagger.io/ might solve this but they currently don't support compose files, or maybe https://www.acorn.io/ ?

It does feel like there is a legitimate space for a multi-container service application standard that isn't just kubernetes, unless perhaps things like k3d and docker desktop end up so good at running tiny k8s that docker compose dies.

SpeedilyDamage|3 years ago

I'm not really sure what problem you're trying to solve here; if you want to follow DRY, there are YAML tricks you can use[0], but if you want docker-compose to "figure it all out" itself, I don't think there are a safe set of assumptions it could make to reliably meet your expectations.

Also I'm not sure it really matters; again this would help to understand your question better, because I think you're trying to solve a problem that may not exist in your threat model.

[0] https://medium.com/@kinghuang/docker-compose-anchors-aliases...
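The YAML tricks referenced in [0] are anchors and aliases; a sketch of how they deduplicate compose config (service names invented):

```yaml
# Define common settings once under an x- extension field...
x-defaults: &defaults
  restart: unless-stopped
  environment:
    TZ: UTC

# ...then merge them into each service with a YAML alias.
services:
  app:
    <<: *defaults
    build: ./app
  worker:
    <<: *defaults
    build: ./worker
```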

qbasic_forever|3 years ago

Docker's network 'isolation' isn't as bulletproof as you might imagine, especially on Linux. From your host machine you can access every single service IP directly, which is sometimes unexpected, as you might assume it's firewalled off. I'm pretty sure that long ago Docker used to warn folks that its network isolation isn't to be used in any production security sense; it's strictly for developer convenience. If the config is unwieldy, you could just throw everything into one default network, as that's about as secure as defining individual networks (i.e. neither is super secure).

kayodelycaon|3 years ago

Personally, I’ve never considered docker compose to have any security at all. The containers have IPs on an OS network device. Why would the host not have access?

Then again, I use openvpn and had to deal with network routing tables.

throwaway892238|3 years ago

> Is there a way to abstract away these extra configuration steps and keep it simple yet secure by default?

No, sorry.

The way it works in the cloud is, you build some infrastructure, you give that infrastructure the ability to contact a secrets management service, and you either A) configure a container orchestrator to look up a secret when your service starts and inject it at start time (either as an env var or as a file), or B) your service itself looks up the secret using permissions inherited from the environment it has started in (such as an instance metadata service).

You could fake all of that in a crappy way by running an additional service which is just a web server with a complicated random URL for each secret you want, configure each service to declare an environment variable which is the complicated random URL, and have each service curl the URL at start time to retrieve a secret. But that's not less configuration, it's slightly more. The only benefit is that you can kill the secret service after the URLs have been grabbed, leaving the secret only in memory (and assuming the URL was rotated every time this whole thing started, would make them temporary, sort of).

But, as a hacker, that would all only present small challenges which I would eventually work around once I find an attack to read from memory or execute code. So unless you need super duper extra security for some reason, just use what you've got.
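Pattern (A) from the comment above can be sketched as an entrypoint wrapper that resolves a file-mounted secret into an env var before handing off to the real service. The wrapper and paths here are invented, and the demo fakes the mounted secret so the sketch runs anywhere:

```shell
#!/bin/sh
# Hypothetical entrypoint wrapper mirroring the *_FILE convention from
# the compose example in the post.
set -eu

# Simulate the file compose would mount at /run/secrets/... :
SECRET_FILE="$(mktemp)"
printf 's3cret' > "$SECRET_FILE"

# Resolve the file into an env var for the service:
DB_PASSWORD="$(cat "$SECRET_FILE")"
export DB_PASSWORD

# Log only the length, never the secret itself:
echo "DB_PASSWORD loaded (${#DB_PASSWORD} chars)"
# exec "$@"   # a real wrapper would exec the service here
```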

withinboredom|3 years ago

> Is there a way to abstract away these extra configuration steps and keep it simple yet secure by default?

Secure from what? What is your threat model? It's hard to imagine 'default' security because everyone is worried about different things.

> I just want them to talk securely and minimize any attack vectors and keep any configuration to a minimum

Your application is more likely to expose a SQL injection vulnerability than someone gaining unauthorized access to your network. I'm not saying you shouldn't secure it, I'm just saying that it is probably pointless since your application has access to :all-the-things: and it is probably the weakest link.

Further, if you're doing this on a developer machine, you're just making it harder to debug issues. As the user of the host machine, you have access to everything by default.

> How do you do (not the configuration management) but manage an inventory of what configuration is needed to run your environment (or a subset for ci)?

Documentation, documentation, documentation. Basically, have a markdown file with each config item, its value in each environment (not literal values, but e.g. 'GitHub token for production, with X, Y, Z scopes'), why it exists, and who is responsible for it.
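Such an inventory file might look like this (all entries invented):

```markdown
| Config item  | Environments       | What / why                              | Owner   |
|--------------|--------------------|-----------------------------------------|---------|
| GITHUB_TOKEN | prod               | GitHub token with X, Y, Z scopes for CI | team-ci |
| DB_URL       | dev, staging, prod | Connection string for the app database  | team-db |
```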

Randomdevops|3 years ago

Being only able to connect to declared dependencies.

So say the application is compromised: it can't connect to the internet, and from there it could only connect to the declared database and webservice. Those would need to have vulnerabilities too that could be exploited from that end, hence limiting the blast radius.

So I'm not really worried about physical access, but more along the lines of an RCE (Spring4Shell) probing the rest of the network, or a supply chain attack that tries to send out data...

Randomdevops|3 years ago

Just saw Borg, Omega, Kubernetes: Lessons learned over a decade (2016) (acm.org) https://queue.acm.org/detail.cfm?id=2898444 pass by.

> "Dependency Management: Standing up a service typically also means standing up a series of related services. If an application has dependencies on other applications, wouldn't it be nice if those dependencies (and any transitive dependencies they may have) were automatically instantiated by the cluster-management system?"

argentinian|3 years ago

I'd like to ask the questioner and to anybody who would like to comment:

As Docker Compose is only for single-node deployments, what is your workflow from dev to production? If your containers are deployed to something like Kubernetes in production, isn't it inconvenient to use Docker Compose for development?

justsomehnguy|3 years ago

> single node deployments

Docker Compose is for single-developer deployments [0].

Every time I mess with Compose and need the things to be services [1], I just give up after some amount of reboots, YAML mangling, cursing, documentation reading and keyboard throwing. I just rebuild it as a Swarm service and it works. Till Swarm (on a single node) breaks, but that's another can of worms.

[0] On developer's machine. On developer's distro.

[1] Be started after reboot, successfully start after reboot, with all dependencies. It doesn't seem like a high bar.. but Docker Compose fails at it.

NorwegianDude|3 years ago

I usually use Docker compose locally, and then I deploy to system containers.

Nearly all my time is spent on developing my own software, so it's not like I make infrastructure changes every month. I set up a deployment pipeline that deploys it the way I want, and then I don't touch it until my infrastructure needs to change.

So I'm usually not even using application containers in production, Docker is just a simple way to simulate production services so that anyone can run a single command and have a complete dev environment.

Docker swarm works well with compose.

Randomdevops|3 years ago

We don't switch technologies. So it's compose all the way to production.

As for Kubernetes, developers currently don't run it locally, they just run the app/apps they're developing from their IDEs and connect to a 'build' environment for the other services/databases (far from ideal)

KRAKRISMOTT|3 years ago

Simple, you pay for a PaaS. I know this is unpopular but it is why platform as a service exists — to abstract away the typical configuration patterns and let somebody else deal with the devops overhead.

Randomdevops|3 years ago

Not at all unpopular, but they just seem to tackle a single application and maybe a database... AWS Beanstalk, Heroku, Fly.io... they've got the CD, autoscaling and IaaS down, but have multiple applications talking to each other and you're back to managing them manually, it seems.

Clicking together some apps in a GUI is nice for some random tests, but I want some kind of manifest that can be promoted between environments.

owenfi|3 years ago

Probably not quite what you're looking for but perhaps along the right lines (and it looks like they've added some products and maybe features) https://traefik.io/traefik/

The gist of it is that Traefik is a reverse proxy that learns about your containers via configuration on the container only, so you can stand up additional services without rebooting everything. Still config-heavy, but maybe less than some other approaches.

rad_gruchalski|3 years ago

Traefik is pretty cool. Can get you very far with forward auth plugins.

Randomdevops|3 years ago

I use it extensively already :)

A few labels and magic!

huimang|3 years ago

I solve the secrets problem by using ansible to provision the host system and build the compose file, and ansible-vault to handle the secrets.

You can also dynamically build a compose file for example:

export COMPOSE_FILE="$(ls -1 services/*/compose.yml | paste -sd ":" -)"
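A runnable illustration of that one-liner (the directory layout is invented), showing how it builds the colon-separated list docker-compose expects in COMPOSE_FILE:

```shell
# Work in a scratch dir so the demo doesn't touch anything real:
cd "$(mktemp -d)"
mkdir -p services/app services/db
touch services/app/compose.yml services/db/compose.yml

# Join all per-service compose files with ":" (COMPOSE_FILE's separator):
COMPOSE_FILE="$(ls -1 services/*/compose.yml | paste -sd ":" -)"
export COMPOSE_FILE
echo "$COMPOSE_FILE"
# → services/app/compose.yml:services/db/compose.yml
# `docker compose up` would now merge all the listed files.
```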

In general I don’t have many issues with the docker-compose.yml format.

colordrops|3 years ago

Could try Nix with Agenix. Modularized and composable, and secrets can be stored encrypted in git.

xrd|3 years ago

This sounds interesting. Do you have a good write-up?

aristofun|3 years ago

Create your own yml based schema and a script converting it to docker compose syntax.

It sounds like a simple straightforward task.

klysm|3 years ago

I find systemd works pretty well

canadaduane|3 years ago

I'm curious if you run docker containers via systemd units? Or are you suggesting no docker?