This is where Saltstack really shines, IMO. Once you have the minion installed, you have a ZMQ command channel on each of those boxes.
salt -G "roles:whatever" cmd.run "blahblahblah"
Easy to target by any number of things - name, OS, IP subnet, custom metadata, etc. Very powerful, and makes intelligent administration of a fleet a lot easier.
For example, I recently used this to audit our fleet for vulnerable PHP+nginx installs, regarding the recent CVE:
While we'll upgrade PHP on any systems it's installed on (also easy to find), this gave me a high-priority hitlist of all machines that needed PHP upgrades because they were running active php-fpm + nginx combinations.
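The audit itself can be a one-liner; this sketch is my guess at the shape of it, not the poster's actual command (the `roles:webserver` grain is a placeholder):

```shell
# Per-host check: flag machines where php-fpm and nginx are both running.
# Fanned out across the fleet with salt it would look something like:
#   salt -G 'roles:webserver' cmd.run "$check"
check='if pgrep -x php-fpm >/dev/null 2>&1 && pgrep -x nginx >/dev/null 2>&1; then echo VULNERABLE; else echo ok; fi'
sh -c "$check"
```

Hosts printing VULNERABLE go to the top of the hitlist; everything else can wait for the routine upgrade pass.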
Yep, salt has the best of all elements. You want simple remote execution? Got it. Agentless? Sure, use salt-ssh. Event bus? ZeroMQ and RAET included. Automation? Sure, reactors cover that. Integration with other systems? Got that too, state, pillar, mine, external events, all included. Even a REST interface.
There isn't much point in using bare SSH any more; heck, you can even use proxy minions to control systems that only have telnet or outdated SSH (like SSH1, or SSH2 with unsupported ciphers), or plain web interfaces with no API.
I've used clusterssh in the past, but nowadays Ansible is simply a better choice. clusterssh is rather brittle.
I don't much like Ansible either (I have opinions about using YAML as a programming language), but for ad-hoc maintenance of a bunch of servers, it works okay.
I am also a "reluctant" Ansible user, and I suspect I share some (perhaps many) of your opinions on YAML, but it is a fairly functional tool set for most administrative situations. There are a few situations where I can't use ansible and have to fall back on pure ssh for one reason or another, and this looks like it might be a useful tool.
Better than writing a bash script with a for loop and nested ssh commands at least.
I was going to say the same: I've used clusterssh and similar in the past, but these days I am using Ansible for those tasks. Primarily, I used it for doing OS updates across the fleet.
Now I do "ansible-playbook dist-upgrade.yml --tags prod" and it takes care of:
- Running on a subset of hosts at a time.
- Pinning my own application packages so only the OS packages are updated.
- Removing from the load balancer.
- Doing the updates.
- Adding back into the load balancer.
- Waiting for the server to go healthy in the LB.
- Unpinning the application packages.
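The steps above map naturally onto a playbook skeleton. This is my guess at the shape, with a hypothetical package name and made-up load-balancer helpers standing in for whatever the real environment uses:

```yaml
# dist-upgrade.yml -- illustrative skeleton, not the actual playbook
- hosts: webservers
  serial: 3                          # run on a subset of hosts at a time
  tasks:
    - name: Hold app packages so only OS packages update
      ansible.builtin.dpkg_selections:
        name: myapp                  # hypothetical package name
        selection: hold

    - name: Drain host from the load balancer
      ansible.builtin.command: lb-drain {{ inventory_hostname }}   # hypothetical helper

    - name: Upgrade OS packages
      ansible.builtin.apt:
        upgrade: dist
        update_cache: true

    - name: Re-add host to the load balancer
      ansible.builtin.command: lb-enable {{ inventory_hostname }}  # hypothetical helper

    - name: Wait for the LB health check to go green
      ansible.builtin.uri:
        url: "http://lb.internal/health/{{ inventory_hostname }}"  # hypothetical endpoint
      register: health
      until: health.status == 200
      retries: 30
      delay: 10

    - name: Unhold app packages
      ansible.builtin.dpkg_selections:
        name: myapp
        selection: install
```

`serial` is what makes the "subset of hosts at a time" part free; everything else is ordinary task ordering.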
I'll say: I use Ansible a lot, so I've become more comfortable with the "using YAML as a programming language", but I don't disagree with you there. It works, and you can become comfortable with it, and it isn't as bad a fit for configuration management as the base statement would make it seem, but there are things that it's a bad fit for. In particular, the looping, especially with notifications.
I used to use tmux to do this. Split the screen half a dozen times. Connect each pane to a separate instance. Enter `setw synchronize-panes on` and you are now running commands on 6 instances simultaneously. From there you can run `htop` to see resources being used on all instances on one monitor, run `apt-get` commands, etc.
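Scripted, that setup looks roughly like this (the session name is a placeholder, and the guard on the first line just makes the sketch safe to run on machines without tmux):

```shell
command -v tmux >/dev/null 2>&1 || exit 0   # skip quietly if tmux is absent

tmux new-session -d -s fleet               # detached session
tmux split-window -t fleet                 # second pane
tmux split-window -t fleet                 # third pane (repeat as needed)
tmux select-layout -t fleet tiled          # even grid
tmux set-window-option -t fleet synchronize-panes on
# From here, anything typed (or sent) goes to every pane at once:
tmux send-keys -t fleet 'uptime' C-m
```

In practice you'd start each pane with an `ssh` to a different host before turning synchronization on.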
I'm seeing a lot more love for ansible than any of the other competitors in the cluster/pool management sphere. I didn't know HN had reached a (rough) consensus on this. Is Ansible well enough accepted as the configuration management tool of choice when not integrating into a larger cluster management system (e.g. Kubernetes)?
I’m not a sysadmin type at all, and I haven’t used the competitors really. But it’s so easy to use! You don’t need to install anything on the servers. It’s basically just a way to run ssh scripts on remote servers in a way that makes complete sense.
My impression of other tools, like Chef, is that they try to abstract away configuration and setup where ansible just feels what I would try to build if I had a bunch of ssh shell scripts. You’re always pretty close to the actual commands being run.
It really shines because it's agentless. If I have to manage the installation of the CM tool myself, I'm picking Ansible, because I don't have to deal with making sure agents run, etc. If another team is going to manage the agents, I like Saltstack (but its agents are somewhat brittle in my experience).
Saltstack offers some very cool orchestration features, and being built on ZMQ offers some interesting capabilities.
Also, with it not requiring an agent, it's feasible for you to use it as an admin even if the rest of the team doesn't adopt it. Other tools that run on agents generally require some amount of higher level buy-in on using the tool.
Ansible has not taken over the sysadmin world completely. Google uses Puppet, and many companies still use saltstack/chef or even just bash scripts via Jenkins. Ansible is a very, very useful tool, even for Kubernetes management.
I don't know, many people seem to love it. From what I've seen, and the crazy amount of repos we have with ansible scripts and YAML and crazy string problems, I'd say I completely hate it with a passion. But I'm not really into the devops side of things, so maybe you need to get much deeper into it to "appreciate" the insanity of YAML files and multiple repos.
Ansible is really easy to get started with. It's not always obvious how to split larger codebases into abstractions (as compared to Chef, Puppet, etc.), but firing away jobs at a group of named hosts is simple.
An even simpler tool is Python Fabric, which is closer to something like Expect (but in Python and over SSH).
Mate... I manage hundreds of servers with it... including EKS cluster lifecycles, creation of golden images, several database stacks, and AWS/cloud provisioning.
It's far far far better than anything else... (especially Chef... It's light years ahead of Chef!)
I feel like often people forget about the small to medium setups run by "The IT guy." I'm a jack of all trades technologist in a major theater. One hour I'm configuring our half dozen Linux servers. The next hour I'm creating AutoCAD drawings, the following hour I'm changing SNMP settings on our 2 dozen Cisco switches, and after lunch I'm designing the electronics for a new stage prop.
I don't have the hours and days to learn Ansible, and it doesn't make sense for such a small environment. And it's Yet Another Service my successor will have to learn in what is already an incredibly niche position.
However doing the same half-dozen steps on a half-dozen servers takes up real time. Tools like cssh, that take 15 minutes to learn and have an IMMEDIATE time saving payoff, are invaluable. And my successor can easily skip using the tool and just do things the hard way until they do have the time to learn more efficient tools.
Ansible's ad-hoc mode is something you can learn really fast. No need to dive into the whole idempotency paradigm or fully grasp the architecture of your configuration management solution to do the basics (looking at you, Chef). Just need to update sudo on 30 boxes? The CLI and an inventory file - that's all you need.
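The sudo example could look like this. Host names are placeholders, `--list-hosts` keeps the sketch from actually connecting anywhere, and the guard skips it where Ansible isn't installed - drop both for real use:

```shell
command -v ansible >/dev/null 2>&1 || exit 0   # skip if ansible isn't installed

# Inventory: just a flat file of hosts (placeholders here)
cat > hosts.ini <<'EOF'
box01.example.com
box02.example.com
EOF

# Ad-hoc module run: update sudo on every host in the inventory.
# --list-hosts only resolves the inventory; remove it to execute for real.
ansible all -i hosts.ini -b -m apt -a "name=sudo state=latest" --list-hosts
```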
On the other side, do stuff like this all the time and you'll start to think that Ansible is just some kind of distributed shell executor on steroids. Absolutely not. While other CM solutions tend to strictly put you inside their way of doing things from the very beginning, Ansible gives you freedom in how you use it. That's why Ansible is a go-to tool not only for serious infrastructure engineering, but also a good helper for any IT-related task.
> I don't have the hours and days to learn Ansible
Really? Are you that busy, or do you just not want to spend a minute of your weekend?
I've set up a few ansible tasks for occasional use, like updating servers, and I think it's worth learning. It's especially good when installing a new server: run it and the machine is back to your usual state in no time.
The good part about task runners is that they self-document your repetitive tasks and are easy to share. You could create a shell script with comments, but ansible is probably more portable.
I used clusterssh in the past and it is really good at sending commands to multiple machines. However for any real work, I would strongly recommend keeping the typing to a bare minimum and do all your work inside a well tested script. Better yet, use ansible or something like it to manage multiple servers
> I would strongly recommend keeping the typing to a bare minimum
Agreed.
> and do all your work inside a well tested script
At a minimum, with `set -ex` at the top. But I've stopped using (ba|fi||t?c)?sh scripts and switched to standard Makefiles for all my deployments. It requires changing the way you would normally code/script actions or interactions, but you get deterministic results.
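A minimal sketch of the Makefile approach (the host, paths, and service name are hypothetical): each step is a target, `make deploy` runs them in order, and any failing command aborts the rest.

```make
SHELL := /bin/sh
.PHONY: deploy upload restart

deploy: upload restart

upload:
	rsync -az ./app/ deploy@web1.example.com:/srv/app/    # hypothetical host/path

restart:
	ssh deploy@web1.example.com 'systemctl restart app'   # hypothetical service
```

Make's dependency model is what buys the determinism: steps run in a declared order and a failure stops the chain, unlike a shell script that plows on without `set -e`.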
I started playing around with self-hosting some things I use a few months ago and I was thinking about using Ansible but it just seemed way too complex.
I wrote a similar tool[1] for a cybersecurity competition I was helping to red team for. It would try a dictionary of username and password combos against a list of hosts generated from the results of a masscan[2]; once it logged in, it would run a bash script on the host to set up our persistence.
From there it would keep a session open on each host and allow you to run commands on a single host, a subset of hosts, or all hosts.
The advantage of this over hydra or some other SSH brute forcer is that it allowed us to run our persistence tooling right away after finding a login and keep that SSH session alive, so we could re-use it even if the password was changed.
The code is a tire fire, but it worked well for what we needed :)
There is a plethora of similar tools (cssh, pssh, dsh), but Ansible's ad-hoc mode supersedes them for any real task involving "simultaneous" management of Linux boxes.
mpssh and kash are also good. kash is part of the kanif perl project, and the one thing I really like about it, which I don't see other projects doing, is that it aggregates similar output before it spits it back.
Thus if you run it to check for a package version on 300 servers, you might get maybe 2-3 sets of output grouped by host instead of 300 separate lines.
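The aggregation trick is simple enough to sketch with awk (the host names and version strings below are made up):

```shell
# Given "host output" pairs, group hosts whose output is identical,
# so 300 identical answers collapse into one line per distinct answer.
printf '%s\n' 'web1 sudo-1.9.5' 'web2 sudo-1.9.5' 'db1 sudo-1.8.2' |
awk '{ hosts[$2] = hosts[$2] " " $1 }
     END { for (o in hosts) print o ":" hosts[o] }' | sort
# -> sudo-1.8.2: db1
#    sudo-1.9.5: web1 web2
```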
If you're using a Linux distribution, the `terminator` terminal app supports multiplexing as part of its built-in feature set, and it has been my default terminal of choice for a while (tabs, splitting, etc...).
I want to add a note to this - if you're using something like this to manage linux servers you're almost certainly doing it wrong. At a server level use ansible, puppet or chef.
One should be thinking of using Terraform and something like Chef/Puppet/Ansible, or better yet moving to Kubernetes, if you find yourself having these kinds of problems.
Yes, but sometimes you can break your configuration management in such a way that it can't recover and need a rapid way to fix things or see the state of the world. It's very handy to have a tool ready to go that can assist. Sure, it's very dangerous when operating on an entire fleet, but break glass in an emergency.
It's also rather handy when you want to run ad hoc queries across your machines, e.g. which kernels are out there, is this leftover rpm installed somewhere, etc.
At a past job we had an in-house tool like this that also logged every command anyone had ever run with it and saved the stdout/stderr output, viewable from a web UI and CLI. If you suspected somebody did something clowny, you could go look at exactly what they did, when, and what the result was. This logging was very important, imho, and tools like this should have it.
Managing servers using parallel ssh is a way old school (and powerful) technique that I don't recommend unless you're using it as a transport mechanism for a config management tool.
A company I worked at developed a custom SSH tool that would tie into your CMDB, kind of like a home-rolled ansible you could use to blast commands to a whole PoP or even the whole network at the same time.
The thing was insanely powerful, but using it was like wielding a machete through a delicate garden.
Ansible is really great in this regard since you have the ubiquity of SSH but sane management, not some difficult to maintain shell script.
At scale parallel ssh performs poorly in comparison to message queues. As mentioned elsewhere in this thread, SaltStack defaults to using ZMQ for communication which has much lower overhead than SSH (with trade offs as well).
Folks are acting like the only use case for this is sending commands for deploying software or something. Even in fully automated environments at scale I used Cluster SSH extensively in my past life for things like grepping local logs on all members of a cluster at the same time, or querying local status. It's immensely helpful when troubleshooting issues in large environments.
Of course you shouldn't be redneck deploying stuff with Cluster SSH when you could write an Ansible Playbook or deploy Chef/Puppet/Salt. But to troubleshoot issues and manage host/OS level functionality on multiple identical systems at the same time, it's invaluable.
Not sure why there's such an Ansible yankfest going on - Ansible is a solution for problems many of us would rather simply avoid - but it's starting to make me believe that there are too many fake accounts used for marketing purposes. We know they're on Facebook, we know they're on Reddit, but Hacker News?
Do you actually have any evidence that such a conspiracy is real?
I mean, I've seen plenty of people jumping to wild assumptions and inventing all sorts of conspiracy theories just because they are faced with the fact that other people have different opinions and tastes - as if that were impossible unless there's a vast conspiracy to push ideas that don't match their personal whims.
I've used Ansible in the past. Ansible sends python scripts over SSH to hosts and runs them remotely. Users specify the state they want the system to be in and the little python scripts run all the checks and apply all changes. Does it take a conspiracy to prefer this approach over simply multicasting SSH connections to multiple hosts?
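That difference in mindset is easy to see in a task file: you declare the end state and the module figures out whether anything needs doing. A minimal illustration (package and service names are just examples):

```yaml
- hosts: all
  become: true
  tasks:
    - name: nginx is installed           # no-op if already true
      ansible.builtin.package:
        name: nginx
        state: present

    - name: nginx is running and enabled
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
```

Run it twice and the second run reports no changes; that idempotence is exactly what a hand-rolled SSH loop doesn't give you.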
I guess I'd like to split it into groups of panes and maybe switch which group is currently visible?
bind-key v setw synchronize-panes
<run-on-all.sh>
All one needs to do is call it like so: ./run-on-all.sh /path/to/cluster/file/list-of-servers-here.txt "sleep 60; reboot"
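The script itself isn't shown, but a minimal reconstruction of what a `run-on-all.sh` like that typically contains (purely my guess, written as a function here) is:

```shell
# Guessed reconstruction of run-on-all.sh, not the poster's actual script.
# Usage: run_on_all list-of-servers.txt "sleep 60; reboot"
run_on_all() {
    file=$1; shift
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        # -n stops ssh from eating the host list on stdin;
        # backgrounding lets slow hosts run in parallel.
        ssh -n -o BatchMode=yes "$host" "$@" &
    done < "$file"
    wait    # block until every host has finished
}
```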
[1] https://github.com/sdshlanta/ssher
[2] https://github.com/robertdavidgraham/masscan
https://github.com/ndenev/mpssh
http://taktuk.gforge.inria.fr/kanif/
[1] https://github.com/innogames/polysh
https://github.com/carlsborg/dust#parallel-ssh-operations