item 22575007

Ask HN: Which configuration management software would/should you use in 2020?

257 points| uaas | 6 years ago

What is your team using at work? What should be used at scale (FAANG, or similar)? What are you planning to switch to?

211 comments

[+] caleblloyd|6 years ago|reply
Not FAANG but for small to medium "cloud native" businesses I like to use this approach with minimal dependencies:

Managed Kubernetes cluster such as GKE for each environment, set up in the cloud provider UI since this is not done often. If you automate it with terraform, chances are that the next time you run it the cloud provider has subtly changed some options and your automation is out-of-date.

Cluster services repository with Helm charts for ingress controller, centralized logging and monitoring, etc. Use a values-${env}.yaml for environment differences. Deploy with CI service such as Jenkins.

Configuration repository for each application with Helm Chart. If it's an app with one service or all services in a single repo this can go in the same repo. If it's an app with services across multiple repos, create a new repo. Use a values-${env}.yaml for environment differences. Deploy with CI service such as Jenkins.

Store secrets in cloud secrets manager and interpolate to Kubernetes secrets at deploy time.

Cloud provider keeps the cluster and VMs up-to-date, CI pipelines do the builds and deployments. No terraform/ansible/other required. Again, this only works for "cloud native" models.
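For context, the values-${env}.yaml layering described above works because Helm deep-merges each `-f` values file over the chart's base values, with later files winning on conflicts. A rough Python sketch of that merge (illustrative only, not Helm's actual implementation):

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base, returning a new dict.
    Nested dicts are merged; scalar values in override win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Base chart values plus a hypothetical values-prod.yaml override:
base = {"replicas": 1, "image": {"repository": "myapp", "tag": "latest"}}
prod = {"replicas": 3, "image": {"tag": "v1.4.2"}}

print(deep_merge(base, prod))
# → {'replicas': 3, 'image': {'repository': 'myapp', 'tag': 'v1.4.2'}}
```

The same layering is what `helm upgrade --install myapp ./chart -f values-prod.yaml` does with the chart's values.yaml as the base.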

[+] asey|6 years ago|reply
Our setup is quite similar to this. Some differences - each environment is represented as a helm parent chart, with each application being a child chart. Each environment chart has its own repo where values.yaml supplies environment-specific overrides for each application. Each application has its own repo where the helm charts and application source both reside.
[+] igetspam|6 years ago|reply
Similar to what I've done.

Applications have a unique source repo. Each repo has a build dir. The build dir contains a subdir for the Docker build, terraform configs (if needed) for dependent infrastructure, and a helm chart for deploys.

I have a few things that don't fit the microservice pattern. They are terraform first (root of repo is TF code) and they have a build dir to define the next steps (mostly packer).

As I'm writing this, I think I need to change that and make the top level a README.md file and use the build dir pattern to be consistent.

[+] nickthemagicman|6 years ago|reply
Yeah, in a decent architecture the only place state is located is in the datastore layer.

The goal is to make servers disposable, able to be destroyed and created at will, so configuration management becomes kind of a legacy technology at that point.

[+] DelightOne|6 years ago|reply
Are you using Jenkins or Jenkins X for the deployment?
[+] phaer|6 years ago|reply
I still prefer the Open Source edition of https://puppet.com/ to manage larger, diverse environments - which may include not just servers, but workstations, network appliances and so on. It's well established, with lots of quite portable modules. But it can also be a bit on the slower side, and it comes with a steeper learning curve than some of the others.

https://www.ansible.com/ is surely a good solution for bootstrapping Linux cloud machines and can be quite flexible. I personally feel like its usage of YAML manifests instead of a domain-specific language can make complex playbooks harder to read and to maintain.

If all you do is to deploy containers on a managed Kubernetes or a similar platform, you might get away with some solution to YAML templating (jsonnet et al) and some shell glue.
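The "templating plus shell glue" approach can be sketched as ordinary code emitting manifests instead of hand-maintained YAML copies; a minimal, hypothetical Python example (names and fields are illustrative):

```python
import json

def deployment(name, image, replicas):
    """Build a Kubernetes Deployment manifest as plain data."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "template": {"spec": {"containers": [{"name": name, "image": image}]}},
        },
    }

# Per-environment differences live in ordinary data, not in copied YAML files:
ENVS = {"staging": 1, "prod": 3}
for env, replicas in ENVS.items():
    manifest = deployment("api", "registry.example.com/api:v2", replicas)
    # JSON is a subset of YAML, so this can be piped to `kubectl apply -f -`
    print(json.dumps(manifest, indent=2))
```

jsonnet, CUE, and friends are essentially more disciplined versions of this pattern.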

I am keeping an eye on https://github.com/purpleidea/mgmt, which is a newer contender with many interesting features but a lack of more complex examples.

Others like saltstack and chef still see some usage as far as I know, but I've got no personal experience with them.

[+] brightball|6 years ago|reply
I favor Ansible for 2 main reasons:

- If you have SSH access, you can use it. No matter what environment or company you work for, there's no agent to install and no need to get approval to use the tool. It's easy to build up a reproducible library of your shell habits that works locally or remotely, and each step can be skipped on a rerun if it has already been applied.

- If you get into an environment where performance across many machines matters more, you can switch to pull-based execution. Because of that, I see very little advantage to any of the other tools that outweighs the advantages of Ansible.
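The "steps can be skipped on reruns" property above is idempotency: a task inspects current state before acting, so replaying the whole playbook is safe. A hypothetical local sketch in Python, in the spirit of an Ansible `lineinfile` task:

```python
import os
import tempfile

def ensure_line(path, line):
    """Append `line` to `path` only if it is not already present.
    Returns True if a change was made (akin to Ansible's 'changed' status)."""
    try:
        with open(path) as f:
            if line in f.read().splitlines():
                return False  # already converged; nothing to do
    except FileNotFoundError:
        pass
    with open(path, "a") as f:
        f.write(line + "\n")
    return True

# A fresh temp dir so reruns of this demo start clean:
path = os.path.join(tempfile.mkdtemp(), "motd")
print(ensure_line(path, "Welcome to the build host"))  # True: line added
print(ensure_line(path, "Welcome to the build host"))  # False: rerun is a no-op
```

Every task written in this check-then-act style makes the whole run safely repeatable.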

[+] grrywlsn|6 years ago|reply
I'm curious why people use configuration management software in 2020. All of that seems like the old way of approaching problems to me.

What I prefer to do is use Terraform to create immutable infrastructure from code. CoreOS and most Linux variants can be configured at boot time (cloud-config, Ignition, etc) to start and run a certain workload. Ideally, all of your workloads would be containerised, so there's no need for configuration drift, or for any management software to be running on the box. If you need to update something, create the next version of your immutable machine and replace the existing ones.

[+] notacoward|6 years ago|reply
"Immutable infrastructure" what a laugh. In a large deployment, configuration somewhere is always changing - preferably without restarting tasks because they're constantly loaded. We have (most) configuration under source control, and during the west-coast work day it is practically impossible to commit a change without hitting conflicts and having to rebase. Then there are machines not running production workloads, such as development machines or employees' laptops, which still need to have their configuration managed. Are you going to "immutable infrastructure" everyone's laptops?

(Context: my team manages dozens of clusters, each with a score of services across thousands of physical hosts. Every minute of every day, multiple things are being scaled up or down, tuned, rearranged to deal with hardware faults or upgrades, new features rolled out, etc. Far from being immutable, this infrastructure is remarkably fluid because that's the only way to run things at such scale.)

Beware of Chesterton's Fence. Just because you haven't learned the reasons for something doesn't mean it's wrong, and the new shiny often re-introduces problems that were already solved (along with some of its own) because of that attitude.

[+] king_phil|6 years ago|reply
But... How do you configure the hosts where your containers are running on? How do you configure your storage (NAS/SAN)? How do you configure your routers and switches? ...
[+] mneil|6 years ago|reply
I'm going to agree with you. In 2020 (and really for the last few years), configuration management is outdated. IaC (infrastructure as code) is the current approach. Containerize everything you can, and use Terraform, CloudFormation, or Azure DevOps.

Avoid managing the underlying OS as much as possible. Use vanilla or prebuilt images to deploy these containers on: CoreOS, Amazon's new Bottlerocket (maybe). Or use a service like Fargate when possible. All configuration should be declarative to avoid errors.

If you need to build images, tools like Packer are great. AWS has a recommended "golden AMI pipeline" pattern and a new Image Builder service if you can't use community images.

I'm speaking imperatively, but read these as my own directives. I work for a company that consults and actively helps Fortune 500s migrate to the cloud, so some of what I'm saying is not possible, or is harder, on-prem, and I recognize that.

If I had to, I still like Chef, with Puppet a second favorite, mostly because of familiarity. Ansible can be used alongside either of these, and tools like Serverspec to validate your images. I don't really use any of this anymore, though.

[+] skywhopper|6 years ago|reply
But you still need to configure things, even if they are immutable at runtime. And you need to manage that configuration over time in some systematic way.

You always have a configuration management system.

[+] dillonmckay|6 years ago|reply
Not all of us have the luxury of our projects being greenfield.
[+] inshadows|6 years ago|reply
What about the system that runs the containers?

"Amazon/Google/Azure takes care of that" is not the answer, unless your comment is predicated on a world where compute can only be rented from big corps... and their methods of managing underlying infrastructure are sacred secrets we are too unworthy to comprehend.

[+] mleonhard|6 years ago|reply
Terraform keeps track of resources it creates. One can remove resources (VMs, managed databases, persistent volumes, DNS records) from the config file and Terraform will cleanly delete them. This is a crucial feature for most deployments.

For example, I deployed an app backend to DigitalOcean with a load balancer, 2x replicated API server, 2x replicated worker process, managed database, file storage, and static website. Terraform is tracking 114 resources for each deployment.

It seems that automatic removal is poorly supported by Ansible, Chef, Puppet, Salt, etc. One can explicitly add items to a "to_remove" list, but this is error-prone.

Terraform has many limitations and problems, but I have found no better tool.
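The state tracking being described can be sketched as a set diff between the desired config and the recorded state; a deliberately simplified illustration (not Terraform's actual plan logic):

```python
def plan(desired, state):
    """Diff the desired config against tracked state, Terraform-plan style."""
    return {
        "create": sorted(desired - state),
        "destroy": sorted(state - desired),  # dropped from config -> cleanly deleted
        "keep": sorted(desired & state),
    }

state = {"vm-api-1", "vm-api-2", "dns-api", "db-main"}  # what was last applied
desired = {"vm-api-1", "vm-api-2", "db-main"}           # dns-api removed from the config

print(plan(desired, state))
# → {'create': [], 'destroy': ['dns-api'], 'keep': ['db-main', 'vm-api-1', 'vm-api-2']}
```

Without the recorded state, a tool can only see what the config says should exist, not what it previously created, which is why the "to_remove list" workaround is needed and why it is error-prone.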

[+] whatsmyusername|6 years ago|reply
I mostly don't for new stuff (all in on Docker/ECS), however we have a lot of old stuff and things in the process of being migrated where it makes sense. There's also always the odd bird thing you use that needs to run on a regular host.
[+] z9e|6 years ago|reply
For mutable infra that holds state. IMO not all infra is going to end up in k8s, and some still needs to be self-hosted.
[+] notyourday|6 years ago|reply
> What I prefer to do is use Terraform to create immutable infrastructure from code.

Can you mount all your volumes read-only and run all of your stack? If you cannot, then you do not have immutable infrastructure. You simply happen to agree that no one writes anything useful, which with time will absolutely fail, because someone, somewhere is going to start storing state on a "stateless" system, giving you a cow #378 called 'Betsy'.

[+] detaro|6 years ago|reply
So, Linux-only? ;)
[+] b5n|6 years ago|reply
Surprised more people here are not using Salt. Having used both Salt and Ansible, I much prefer Salt, especially when working with larger teams.

When working solo I use Guix, both Guix and Nix are _seriously_ amazing.

[+] yani|6 years ago|reply
Salt has much nicer configs and feels superior to Ansible. The main disadvantage I had with Salt was the need to run a salt master server. I read that this is no longer needed, but I haven't tried it myself. Keeping secrets outside of the repo was not a trivial task; Ansible has an easy way to encrypt secrets.
[+] apple4ever|6 years ago|reply
Wow really? I tried to learn Salt and it was way too complex. Comparatively Ansible was amazing to learn.
[+] skrebbel|6 years ago|reply
What's salt? Any link? I found something called SaltStack but that appears to be enterprise security software.
[+] perlgeek|6 years ago|reply
I use Ansible, mostly because it works pretty well for deployments (on traditional, non-dockerized applications), and then I can just gradually put more configuration under management.

So it's a very good tool to gradually get a legacy system under configuration management and thus source control.

[+] rootforce|6 years ago|reply
My default tends to be Ansible because it is really versatile and lightweight on the systems being managed. That versatility can bite you though because it's easy to use it as a good solution and miss a great one. Also, heaven help you if you need to make a change on 1000s of hosts quickly.

I also use (in order of frequency): Terraform, Invoke (sometimes there is no substitute for a full programming language like Python), Saltstack (1000s of machines in a heterogeneous environment)

If I were going to deploy a new app on k8s today, I would probably use something like https://github.com/fluxcd/flux.

I haven't really had a pleasant time with the tooling around serverless ecosystem yet once you get beyond hello worlds and canned code examples.

[+] yjftsjthsd-h|6 years ago|reply
> Also, heaven help you if you need to make a change on 1000s of hosts quickly.

Why? I would have seen that as Ansible's strong point.

[+] witcher|6 years ago|reply
I might be fanboy of the type safety and having a quick feedback loop, but I cannot imagine a better configuration management system than just straight configuration as code e.g. in Go: https://github.com/bwplotka/mimic

I really don't see why so many weird, unreadable languages like jsonnet or CUE were created when there is already a type-safe, script-like language (Go compiles in milliseconds, and there is even a go run command) with full-fledged IDE autocompletion support, abstractions and templating capabilities, mature dependency management, and much more. Please tell me why we are inventing thousands of weird things when we have ready tools that help with configuration as well! (:
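The linked tool does this in Go; the same configuration-as-code idea can be sketched in Python with dataclasses, so the type checker catches a missing field or a misspelled key before anything is deployed (the names and fields below are hypothetical, loosely Prometheus-shaped):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class ScrapeTarget:
    """A typed config object: constructing it with a wrong or missing
    field fails immediately, unlike editing templated YAML by hand."""
    job_name: str
    targets: list
    scrape_interval: str = "30s"

config = {"scrape_configs": [
    asdict(ScrapeTarget("node", ["localhost:9100"])),
    asdict(ScrapeTarget("api", ["api:8080"], scrape_interval="10s")),
]}
print(json.dumps(config, indent=2))  # JSON is valid YAML, so this can be shipped as-is
```

The payoff is the one the comment describes: abstractions, autocompletion, and refactoring tools for free, because config is just code.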

[+] beders|6 years ago|reply
I agree. I wish we could just use EDN and Clojure, but your DevOps guy is not writing Go or Clojure code.

They are also not doing code reviews to enforce security policies.

If you have DevOps guys who are also software developers, more power to you, but if I approach my DevOps team with:

"Hey, just code your scripts in this Turing-complete language," they will ask me "what's your username again?" BOFH-style ;)

[+] wpietri|6 years ago|reply
Please consider that you're a principal engineer with a BS and a Master's. And you've achieved all those things quite quickly! You're on the far end of a bell curve.

A full programming language is the natural choice for people who are full programmers. But for people who aren't, they're intimidating and add a lot of complexity. Templating systems are much more approachable for people who have a lot of experience configuring things via big blobs of text.

As a programmer, I would personally rather express everything in a programming language, so I get your perspective here. But it isn't an accident that there are so many ops-focused systems that are different takes on just automating the things people were previously doing manually.

[+] detaro|6 years ago|reply
The tool you link recommends "kubectl apply, ansible, puppet, chef, terraform" to actually apply the changes, at least 3 of those I'd classify as configuration management. Generating the configuration is only a small part of it, and the traditional tools typically have some way to do that too because they were designed to be used by non-/almost-non-coders too.
[+] aganame|6 years ago|reply
Hashicorp tools are quite solid, and give you a lot for free. Ansible can automate host-level changes in places where hashicorp cannot reach. There shouldn't be many such places.

Alternatively, if you have the option of choosing the whole stack, Nix/NixOS and their deployment tools.

I would recommend staying away from large systems like k8s.

[+] maximilianburke|6 years ago|reply
Here's what we're using which I'm pretty happy with:

0. Self-hosted Gitlab and Gitlab CI.

1. Chef. I'd hardly mention it because its use is so minimal, but we have it set up for our base images for the nitpicky stuff like connecting to LDAP/AD.

2. Terraform for setting up base resources (network, storage, allocating infrastructure VMs for Grafana).

3. Kubernetes. We use a bare minimum of manually maintained configuration files; basically only for the long-lived services hosted in cluster plus the resources they need (ie: databases + persistent volumes), ACL configuration.

4. Spinnaker for managing deployments into Kubernetes. It really simplifies a lot of the day-to-day headaches; we have it poll our Gitlab container repository and deploy automatically when new containers are available. Works tremendously well and is super responsive.

[+] tilolebo|6 years ago|reply
We use Ansible with Packer to create immutable OS images for VMs.

Or Dockerfile/compose for container images.

Cloud resources are managed by Terraform/Terragrunt.

[+] mikepurvis|6 years ago|reply
I think this is the ideal scenario for Ansible— one-time configuration of throwaway environments, basically as a more hygienic and structured alternative to shell scripts.

My experience trying to manage longer lived systems like robot computers over time with Ansible has been that it quickly becomes a nightmare as your playbook grows cruft to try to account for the various states the target may be coming from.

[+] uranium235|6 years ago|reply
You can never go wrong with bash. You should not put secrets in the 169.254.169.254 metadata service, and you should not have IAM profiles with overreaching privileges. For any IAM profile you use (or whatever the equivalent is on Azure or GCP), always consider what somebody could do with it if they got access to it.
[+] ratiolat|6 years ago|reply
Salt because it's declarative and runs on linux, windows and osx.
[+] aprdm|6 years ago|reply
I have been using Ansible for over four years now, my current use case has around 1k VMs and a handful of baremetal in a couple of different datacenters running 100s of services.

No orchestration either, FWIW; we usually have Ansible configuring Docker to run and pulling the images...

As for the future I have been meaning to explore Terraform and some Orchestration platforms (Nomad).

[+] polcia|6 years ago|reply
I would go with Ansible for side projects/smaller tasks, and use Puppet at large.
[+] geofft|6 years ago|reply
Shameless plug for a thing I maintain, which is in the config management space but a little bit different from the usual tools: https://github.com/sipb/config-package-dev#config-package-de...

config-package-dev is a tool for building site-specific Debian packages that override the config files in other Debian packages. It's useful when you have machines that are easy to reimage / you have some image-based infrastructure, but you do want to do local development too, since it integrates with the dpkg database properly and prevents upgraded distro packages from clobbering your config.

My current team uses it - and started using it before I joined the company (I didn't know we were using it when I joined, and they didn't know I was applying, I discovered this after starting on another team and eventually moved to this team). I take that as a sign that it's objectively useful and I'm not biased :) We also use some amount of CFEngine, and we're generally shifting towards config-package-dev for sitewide configuration / things that apply to a group of machines (e.g. "all developer VMs") and CFEngine or Ansible for machine-specific configuration. Our infrastructure is large but not quite FAANG-scale, and includes a mix of bare metal, private cloud and self-run Kubernetes, and public cloud.

I've previously used it for

- configuring Kerberos, AFS, email, LDAP, etc. for a university, both for university-run computer labs where we owned the machines and could reimage them easily and for personal machines that we didn't want to sysadmin and only wanted to install some defaults

- building an Ubuntu-based appliance where we shipped all updates to customers as image-based updates (a la CrOS or Bottlerocket) but we'd tinker with in-place changes and upgrades on our test machines to keep the edit/deploy/test cycle fast

[+] smotti|6 years ago|reply
Ansible for dev boxes or smaller deployments. For large-scale deployments, CFEngine3. When deployed within a cloud environment, one doesn't even need a master node for CFE3; the agents can just pull the latest config state from some object storage.
[+] cfgmaster|6 years ago|reply
If you want massively parallel remote script execution, nothing beats GNU parallel or xargs + "ssh user@host bash < yourscript.sh".

All of the configuration management tools (Ansible, Puppet, Chef, Salt, etc.) are bloated.

We already have a perfectly FINE SHELL. Why do we need a crappy, ugly DSL or weird YAML??

These days, newbies write Ansible playbooks without even basic knowledge of the Unix shell and its commands. What the hell?

I like ssh + pure posix shell approach like

Show HN: Posixcube, a shell script automation framework alternative to Ansible https://news.ycombinator.com/item?id=13378852