Ask HN: A Better Docker Compose?
74 points | Randomdevops | 3 years ago
    services:
      proxy:
        build: ./proxy
        networks:
          - frontend
      app:
        build: ./app
        networks:
          - frontend
          - backend
      db:
        image: mysql
        networks:
          - backend
You add config to share credentials between services.

    services:
      app:
        build: ./app
        environment:
          DB_PASSWORD_FILE: /run/secrets/db_root_password
        secrets:
          - db_root_password
      db:
        image: mysql
        environment:
          MYSQL_ROOT_PASSWORD_FILE: /run/secrets/db_root_password
        secrets:
          - db_root_password
    secrets:
      db_root_password:
        file: ./secrets/db_root_password.txt
Is there a way to abstract away these extra configuration steps and keep it simple yet secure by default? If you expressed db as a dependency/resource of app, could the tooling infer that it should go in a separate network and have the credentials automatically linked? 'As a developer' I'm not really interested in the network specifics or which credentials are used; I just want the services to talk securely, minimize any attack vectors, and keep configuration to a minimum. With tens of apps, their databases, and transversal connections, how do you keep the configuration to a minimum?
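Something like this hypothetical shape is what I mean (invented syntax, not any real tool):

```yaml
# Hypothetical syntax: declare only the dependency, and let the tooling
# derive the private network, the secret, and the env wiring.
services:
  app:
    build: ./app
    depends_on_resources:
      db:
        type: mysql
# From this, a generator could emit: a network shared only by app and db,
# a generated db_root_password secret mounted into both, and
# DB_PASSWORD_FILE / MYSQL_ROOT_PASSWORD_FILE pointing at it.
```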
Googling around I found:
humanitec: https://docs.humanitec.com/integrations/resource-types/mariadb They express something as 'resources': a dependency type that can be provisioned automatically, with inputs and outputs that can then be injected into the application's env vars:

    mysql://${externals.my-db.username}:${externals.my-db.password}@${externals.my-db.host}:${externals.my-db.port}/${externals.my-db.name}
You're limited to a fixed set of drivers, of course, and how would you express an app1-to-app2 dependency?
juju: https://juju.is/
Each app is packaged in a charm which seems to be a yaml declaring inputs, dependencies and other meta data and optional python code that can respond to certain lifecycle hooks
https://discourse.charmhub.io/t/implementing-relations/1051
    name: my-node-app
    ...
    requires:
      database:
        interface: mongodb
    provides:
      website:
        interface: http
Things can seemingly be autowired based on what interface they provide and require?
So just make a list of apps until everything resolves? Does anyone have experience with these tools or others like them?
How do you stop others and yourself from drowning in credentials, certificates, env vars, JVM params and k8s YAML (times every environment)? How do you manage (not the configuration itself, but) an inventory of what configuration is needed to run your environment (or a subset for CI)?
achempion|3 years ago
I've been using docker swarm/compose extensively for multiple years; configuration can be a bit verbose sometimes, but it's a very simple system to maintain and write config for.
I think splitting relatively small configs into smaller bits and adding extra abstraction can actually increase the complexity of the system compared to having everything in a single file, as it's basically write-and-forget and not something you modify heavily once the system is mature.
Randomdevops|3 years ago
But imagine the complete deployment looks like 2 monoliths, a dozen supporting webservices, 3 databases, redis, an elasticsearch cluster, ActiveMQ and a mailserver.
Then for a development track you only want to run monolith A (cost/resources/startup time), with the mailserver swapped for MailHog so you don't accidentally send something to real addresses.
But you can't just split the 2 monoliths into separate compose files, as they share services/DBs. If you did that you would have to wire them up manually.
For local development you might only want to run webservice B (which has a db and redis as dependencies)
And indeed, if you try to manage several files that need to be combined in a specific way, things get messy...
So then things seems to boil down to starting every service separately and relying on service-discovery, but then the lifecycle isn't tied together anymore. So if I want to tear them all down, how would I know which ones I spun up in this imaginary context?
Say I want to spin up an on demand integration environment for an application, what else do I start too?
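One partial answer with the current compose CLI is the project name: every container started under it is labeled with that project, so teardown doesn't need to remember which files were combined (file names below are hypothetical):

```
# Bring up one development track from several partial files under one project name:
docker compose -p track-a -f base.yml -f monolith-a.yml -f mailhog.yml up -d

# Later, tear down everything started under that project,
# without knowing which -f files were combined:
docker compose -p track-a down
```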
mooreed|3 years ago
> “you can solve every problem by adding or removing an abstraction layer”
Have you considered generating your YAML/JSON config with something that composes?
If you are open to it you might be interested in dhall [1] as it’s a config language with variables, functions and imports.
I have used it for pet projects and I could see how it could offer some tidy encapsulation patterns for larger, more complicated production applications.
[1]: https://dhall-lang.org/
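Short of adopting dhall, even a small script in a general-purpose language gets you composition. A sketch (all service names and the dependency map are invented) that derives per-edge private networks from declared dependencies, in the spirit of what the OP asks for:

```python
# Sketch: derive a compose-style structure from declared dependencies.
# Each consumer->dependency edge gets its own private network that only
# those two services join, limiting the blast radius by construction.

DEPENDS = {
    "proxy": ["app"],
    "app": ["db"],
    "db": [],
}

def generate_compose(depends):
    services = {name: {"networks": []} for name in depends}
    networks = {}
    for consumer, deps in depends.items():
        for dep in deps:
            net = f"{consumer}-{dep}"          # one network per edge
            networks[net] = {"internal": True}  # no outbound internet
            services[consumer]["networks"].append(net)
            services[dep]["networks"].append(net)
    return {"services": services, "networks": networks}

compose = generate_compose(DEPENDS)
# app sits on both of its edges; db only on the one it serves.
print(sorted(compose["services"]["app"]["networks"]))  # ['app-db', 'proxy-app']
```

From there you could serialize the dict to YAML and feed it straight to docker compose.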
satvikpendem|3 years ago
oneplane|3 years ago
While Docker Desktop / Podman / Rancher Desktop combined with stuff like Skaffold aren't exactly a drop-in replacement for docker-compose, they do a much better job at bringing up and tearing down entire compositions while re-using existing packaging and access controls.
If you are running docker-compose for non-development things, it might be a different story; it might be suitable for non-GitOps things, but as posted elsewhere, at that point you're better off using something like systemd.
When composing really small setups I either do this with a shell script (think 10 lines including trapping exits) or a systemd unit. Whenever it needs to be bigger I nearly always end up with an actual scheduler (K8S, Nomad) and GitOps because you can't really deliver something maintainable, available and durable anymore without it (well... I suppose if you have only 1 project to deliver, forever, you could manually manage it).
It does get a whole lot easier when you have a common foundation you can re-use. Spinning up an entire stack with security, monitoring, alerting, automatic rollouts/rollbacks for even the smallest project is just single-digit minutes work now.
Pulling in some other factors: how sharable/collaboratable is something these days if it is not built on similar enough technologies and modules? A solo yolo project might not care much about this, but when was the last time someone asked for software that is risky and not durable?
theteapot|3 years ago
What? I'm not involved and don't follow closely, but I'm pretty sure it's about as dead as Docker itself, i.e. not dead. There were commits 8 hours ago -- https://github.com/docker/compose/. Not sure who did that if not "the community".
smcleod|3 years ago
solarkraft|3 years ago
KronisLV|3 years ago
> While Docker Desktop / Podman / Rancher Desktop combined with stuff like Skaffold aren't exactly a drop-in replacement for docker-compose, they do a much better job at bringing up and tearing down entire compositions while re-using existing packaging and access controls.
I don't know about this: in my experience the mentioned alternatives are generally way more complex and cumbersome to initially set up, for smaller projects.
I've seen successful projects use Docker Compose locally for all of the dependencies that need to run in containers (sometimes the apps run locally for easier debugging, other times they also run in containers so there's no need to set up a runtime). I've even seen Docker Compose used in production for projects that don't need to scale that much - just one server in which these containers run, not that dissimilarly from what you might get with systemd services on a single node.
And when the need to scale out finally arises, then something like Docker Swarm (which uses the Compose specification) has been enough in the majority of cases, when you don't need anything too fancy (e.g. network partitioning, a la Istio/Kiali). It's extremely simple to set up and works really well, especially with something like Portainer. In addition, its integration with Ansible is also pretty great - though I'd advise Ansible in most cases in general, when you manage the underlying servers.
And I know that many claim that Swarm is dead - though in my eyes it's also basically feature complete and still gets new releases and bug fixes alongside Docker (which others claim is also dead); a bit of caution and having migration paths (e.g. Kompose) doesn't hurt, of course. When you do need (or desire to) run something more "current", then Nomad becomes sufficient on the servers, but I haven't had the need to run it locally, ever - since your environment descriptions will still be reasonably simple, you can just use Docker Compose locally and the HCL equivalent on servers, provided that you have dev/test/staging environments before prod to test everything out on, even if this is a drawback.
If you do decide to go with Kubernetes, then in my experience k0s, K3s (developed by Rancher) or even MicroK8s are all excellent starting points, but at that point you'll probably also want Helm charts and will essentially need something like Skaffold locally. When you get to that point, however, I've found that dealing with the complexity of Helm charts and running Kubernetes locally has more overhead than just using Swarm.
> It does get a whole lot easier when you have a common foundation you can re-use. Spinning up an entire stack with security, monitoring, alerting, automatic rollouts/rollbacks for even the smallest project is just single-digit minutes work now.
This is an excellent point, however! Maybe the DevOps cultures I've seen haven't been strong enough, though, since this has never really been the case in my experience and each service had a bit of custom work to be done for these aspects.
> Pulling in some other factors: how sharable/collaboratable is something these days if it is not built on similar enough technologies and modules? A solo yolo project might not care much about this, but when was the last time someone asked for software that is risky and not durable?
Here's a counterpoint - while many out there might be in the sweet spot where they can just pay someone else to give them a managed Kubernetes cluster, what about those who can't? What about those who decide to use Kubernetes, but have to host and upgrade their own clusters? It can easily become less durable and more risky, especially if your DevOps/Ops people aren't familiar with all of the nuances of Kubernetes.
Right tool for the job and all. Sometimes it's a great choice, other times it's risky.
stokedbits|3 years ago
This will bring you closer to the deployment stack if you are deploying to Kubernetes, and also lets you leverage tools like kustomize to DRY out your configurations.
There are some great projects like tilt, devspace, skaffold, etc. that help facilitate developing on a local or remote cluster.
As far as configuration management, that can be as simple as cascading kustomize configs or Helm, then leveraging something like Vault. The point really is: if you start with Kubernetes you have way more flexibility with tooling and options to do whatever the heck you want.
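The cascading-configs idea looks roughly like this in kustomize (directory layout and patch file name are hypothetical):

```yaml
# overlays/dev/kustomization.yaml
resources:
  - ../../base            # the shared definitions
patches:
  - path: replicas.yaml   # dev-only tweaks layered on top
```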
Shameless plug I recently started a series on local Kubernetes development. It covers some of this with tilt and more. If you would like a specific thing covered here I can add an installment to it. https://youtube.com/watch?v=Nj55RDVwrIE&si=EnSIkaIECMiOmarE
ecuaflo|3 years ago
ditsuke|3 years ago
na4ma4|3 years ago
https://docs.docker.com/compose/environment-variables/
klntsky|3 years ago
Arion docs: https://docs.hercules-ci.com/arion/
Nix language: https://nixos.wiki/wiki/Nix_Expression_Language
pxc|3 years ago
There are other tools that model things more like the OP is thinking, though! With Disnix, you have an explicit service model where you declare those dependencies.
See: https://hydra.nixos.org/build/203347995/download/2/manual/ or maybe Sander's blog posts on it
fswd|3 years ago
I then disable iptables in /etc/docker/daemon.json ("iptables": false) to prevent Docker from poking a hole in my firewall. (This should be on by default.) I also enable user namespaces with "userns-remap": "default" so that the docker image runs isolated; I can disable this in my docker-compose with userns=host. Actual instructions are slightly different, but: https://docs.docker.com/engine/security/userns-remap/
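Spelled out, the daemon.json from the two settings above is just:

```json
{
  "iptables": false,
  "userns-remap": "default"
}
```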
I also found I could disable networking altogether on some services and just expose a port. Some services don't need to connect to the net.
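For the no-outbound-access case, compose's internal networks express that directly; a minimal sketch (service and image names invented):

```yaml
services:
  worker:
    image: example/worker   # hypothetical image
    networks:
      - isolated
networks:
  isolated:
    internal: true          # members can talk to each other, but not out
```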
Then I run a cloudflare argo tunnel to the application. But you could also use a nginx reverse proxy, some other tunnel, or a one page express.js reverse proxy with pm2. There's a cloudflare argo tunnel docker-compose setup which allows you to type in your tunnel information in the config. I thought that was cool but I did not try it.
ecuaflo|3 years ago
The first solution I happened upon was serverless. Specifically SST, which is built with the AWS CDK, but you must develop against live services and I just can't justify paying to develop.
Then I found Serverless Framework, which is an abstraction over CloudFormation, but the offline solutions like localstack get a lot of flak for being buggy, and localstack charges for some services. I also looked into Architect, but the documentation is abysmal.
Then I figured serverful might be the easier way to go. I found that docker compose has a built-in integration with AWS ECS where it transforms your YAML into CloudFormation to provision the right services. However, it seems to be missing key parts like custom domain and SSL certificate provisioning, which seems to defeat the IaC ethos.
Then I figured I might go with Terraform and I found some seemingly good starters like https://github.com/aws-ia/terraform-aws-ecs-blueprints https://github.com/cloudposse/terraform-aws-ecs-web-app https://github.com/turnerlabs/terraform-ecs-fargate but the examples are just lacking. They don’t have any examples for multiple containers that can access each others’ resources that I can find. Reading these templates has at least given me a better idea of the resources I need to provision in AWS but the networking and configuration still frighten me. Like do I need to configure nginx with a reverse proxy myself? How do I orchestrate that container with the others? And apparently services can crash and just not restart? And I need to make sure to configure volumes for data that needs to persist. And setting up the CI/CD seems daunting.
I’ve also heard about docker swarm, kubernetes, pulumi, AWS SAM, etc but it’s a lot to learn. When I go on Discords for web frameworks, mostly everyone including the devs of these frameworks use 2nd tier managed providers like Vercel, Fly, Netlify, Supabase, Cloudflare, etc. But many of those are just not as reliable as core cloud providers, the cost is way higher, and now you’re setting up a local stack that probably wildly differs from how it will work in production between those services. Glad to see I’m not alone in a very reasonable expectation of a simple way to orchestrate multiple containers on AWS, what must be the most common use case web developers have
latchkey|3 years ago
Take a step back and simplify everything.
Personally, I just use GCP cloud functions talking to a GCP managed postgres database. I don't worry about all the deploy stuff since their tooling takes care of it for me... heck, integrate it with github actions and it is just a git push to go live.
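The "git push to go live" part can be as small as a workflow like this (function name, region, runtime and secret name are all placeholders):

```yaml
# .github/workflows/deploy.yml - a sketch; all names are placeholders
name: deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}
      - uses: google-github-actions/setup-gcloud@v2
      - run: |
          gcloud functions deploy my-fn \
            --runtime=python312 --trigger-http --region=us-central1
```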
GCP is also next to free for low tier usage. Last I looked, a tiny postgres instance was one of the cheapest out there.
Then, just put cloudflare in front of GCP and now you've got the ability to use their functions in front of GCP too.
Fully scalable, relatively simple and low cost.
brundolf|3 years ago
Aren't they? I've never heard this, do you have a source?
> the cost is way higher
Depends. For small-medium stuff the cost may be "way higher" in that it's dollars instead of pennies per month, but you have to weigh that cost against the cost of your time (how many months of PaaS hosting did the past week cost you in dev hours?). I'm sure the economics become significant at some scale, but I'd ask yourself whether they are at your scale
> and now you’re setting up a local stack that probably wildly differs from how it will work in production between those services
That doesn't have to be true at all; many of these services let you ship Docker containers if you want to (and manage everything else from there), but it's possible to go without even that: https://news.ycombinator.com/item?id=34243045
I'd recommend taking a step back and seeing how much complexity is really necessary for what you're trying to do
klooney|3 years ago
Getting live with EKS in AWS is a little scary and painful and not cheap, but if you just work through https://www.eksworkshop.com/ it's pretty plug and chug.
Doing it with CloudFormation or Terraform or something built on top like Pulumi or the CDK is something I wouldn't recommend unless you do this for a living. Just stand up the infra manually: doing a bad job is like 10% of the work required to do a "good" infra-as-code job, and you don't sound interested in the project.
All your actual app stuff can be nice declarative k8s yaml, and you'll be good to go indefinitely.
samuraixp|3 years ago
I was hoping something like https://dagger.io/ might solve this but they currently don't support compose files, or maybe https://www.acorn.io/ ?
It does feel like there is a legitimate space for a multi-container service application standard that isn't just kubernetes, unless perhaps things like k3d and docker desktop end up so good at running tiny k8s that docker compose dies.
SpeedilyDamage|3 years ago
Also I'm not sure it really matters; again this would help to understand your question better, because I think you're trying to solve a problem that may not exist in your threat model.
[0] https://medium.com/@kinghuang/docker-compose-anchors-aliases...
qbasic_forever|3 years ago
kayodelycaon|3 years ago
Then again, I use openvpn and had to deal with network routing tables.
throwaway892238|3 years ago
No, sorry.
The way it works in the cloud is, you build some infrastructure, you give that infrastructure the ability to contact a secrets management service, and you either A) configure a container orchestrator to look up a secret when your service starts and inject it at start time (either as an env var or as a file), or B) your service itself looks up the secret using permissions inherited from the environment it has started in (such as an instance metadata service).
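Pattern A is also why many official images support the *_FILE convention seen in the OP's compose file; the service-side half of it is tiny. A sketch:

```python
import os

def read_secret(name, default=None):
    # Resolve NAME_FILE (a mounted secret file) first, then fall back to
    # a plain NAME env var, then to the given default.
    path = os.environ.get(f"{name}_FILE")
    if path:
        with open(path) as f:
            return f.read().strip()
    return os.environ.get(name, default)
```

With this, the compose file only has to point DB_PASSWORD_FILE at the mounted secret and the service never sees the plaintext in its environment listing.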
You could fake all of that in a crappy way by running an additional service which is just a web server with a complicated random URL for each secret you want, configure each service to declare an environment variable which is the complicated random URL, and have each service curl the URL at start time to retrieve a secret. But that's not less configuration, it's slightly more. The only benefit is that you can kill the secret service after the URLs have been grabbed, leaving the secret only in memory (and assuming the URL was rotated every time this whole thing started, would make them temporary, sort of).
But, as a hacker, that would all only present small challenges which I would eventually work around once I find an attack to read from memory or execute code. So unless you need super duper extra security for some reason, just use what you've got.
withinboredom|3 years ago
Secure from what? What is your threat model? It's hard to imagine 'default' security because everyone is worried about different things.
> I just want them to talk securely and minimize any attack vectors and keep any configuration to a minimum
Your application is more likely to expose a SQL injection vulnerability than someone gaining unauthorized access to your network. I'm not saying you shouldn't secure it, I'm just saying that it is probably pointless since your application has access to :all-the-things: and it is probably the weakest link.
Further, if you're doing this on a developer machine, you're just making it harder to debug issues. As the user of the host machine, you have access to everything by default.
> How do you do (not the configuration management) but manage an inventory of what configuration is needed to run your environment (or a subset for ci)?
Documentation, documentation, documentation. Basically, have a markdown file with each config item, its value in each environment (not literal values, but e.g. 'github token for production, with X,Y,Z scopes'), why it exists, and who is responsible for it.
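A sketch of what such an inventory file might look like (entries invented):

```markdown
| Key          | Per environment                        | Why it exists          | Owner    |
|--------------|----------------------------------------|------------------------|----------|
| GITHUB_TOKEN | prod: X,Y,Z scopes; ci: read-only      | release pipeline auth  | platform |
| DB_PASSWORD  | per-env, from the secrets store        | app -> db auth         | app team |
```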
Randomdevops|3 years ago
So say the application is compromised: it can't connect to the internet, and from there it could only connect to the declared database and webservice. So those would need to have vulnerabilities too that could be exploited from that end, hence limiting the blast radius.
So not really worried about physical access, but more along the lines of an RCE (Spring4Shell) probing the rest of the network, or a supply-chain attack that tries to send out data...
Randomdevops|3 years ago
"Dependency Management: Standing up a service typically also means standing up a series of related services. If an application has dependencies on other applications, wouldn't it be nice if those dependencies (and any transitive dependencies they may have) were automatically instantiated by the cluster-management system?"
argentinian|3 years ago
As docker compose is only for single-node deployments, how is your workflow from dev to production? If your containers are deployed to something like Kubernetes in production, isn't it inconvenient to use docker compose for development?
justsomehnguy|3 years ago
Docker Compose is for single-developer deployments [0].
Every time I mess with Compose and need things to be services [1], I just give up after some amount of reboots, YAML mangling, cursing, documentation reading, and keyboard throwing. I just rebuild it as a Swarm service and it works. Till Swarm (on a single node) breaks, but that's another can of worms.
[0] On developer's machine. On developer's distro.
[1] Be started after reboot, successfully start after reboot, with all dependencies. It doesn't seem like a high bar... but Docker Compose fails at it.
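For reference, the compose-file side of "start after reboot" is the restart policy; the complaint in [1] is about everything around it (ordering, dependencies actually being up) working reliably. Image names below are placeholders:

```yaml
services:
  app:
    image: example/app        # hypothetical
    restart: unless-stopped   # restarts with the daemon after reboot
    depends_on:
      - db
  db:
    image: mysql
    restart: unless-stopped
```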
NorwegianDude|3 years ago
Nearly all my time is spent on developing my own software, so it's not like I make infrastructure changes every month. I set up a deployment pipeline that deploys it the way I want, and then I don't touch it until my infrastructure needs to change.
So I'm usually not even using application containers in production, Docker is just a simple way to simulate production services so that anyone can run a single command and have a complete dev environment.
Docker swarm works well with compose.
Randomdevops|3 years ago
As for Kubernetes, developers currently don't run it locally, they just run the app/apps they're developing from their IDEs and connect to a 'build' environment for the other services/databases (far from ideal)
benignslime|3 years ago
[deleted]
KRAKRISMOTT|3 years ago
Randomdevops|3 years ago
Clicking together some apps in a GUI is nice for some random tests, but I want some kind of manifest that can be promoted between environments.
owenfi|3 years ago
The gist of it is Traefik is a reverse proxy that learns about your containers via configuration in the container only, so you can stand up additional services without rebooting everything. Still config heavy, but maybe less than some other approaches.
rad_gruchalski|3 years ago
Randomdevops|3 years ago
A few labels and magic!
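Roughly these labels, going by the Traefik docs (router name and host are placeholders):

```yaml
services:
  whoami:
    image: traefik/whoami
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.whoami.rule=Host(`whoami.localhost`)"
```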
huimang|3 years ago
You can also dynamically build a compose file, for example:

    export COMPOSE_FILE="$(ls -1 services/*/compose.yml | paste -sd ":" -)"
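Expanded, that one-liner builds a colon-separated list that docker compose treats like multiple -f files. A reproducible demo (directory names invented):

```shell
# Build a COMPOSE_FILE list from a hypothetical services/ layout.
mkdir -p demo/services/app demo/services/db
touch demo/services/app/compose.yml demo/services/db/compose.yml
cd demo
export COMPOSE_FILE="$(ls -1 services/*/compose.yml | paste -sd ':' -)"
echo "$COMPOSE_FILE"
# -> services/app/compose.yml:services/db/compose.yml
```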
In general I don’t have many issues with the docker-compose.yml format.
derfabianpeter|3 years ago
[1] https://docs.polycrate.io
mdaniel|3 years ago
I think it's the new "JS framework syndrome" for everyone to invent their own license :-/
colordrops|3 years ago
yewenjie|3 years ago
xrd|3 years ago
aristofun|3 years ago
It sounds like a simple straightforward task.
klysm|3 years ago
canadaduane|3 years ago
benignslime|3 years ago
[deleted]