We are talking about system administration and system engineering for Linux. Most of this also works more or less the same on Unix, but this is today much less relevant. It is even possible to do some of these things on MS-Windows, but that is another story… So just assume for the time being OS=Linux. Or wait for the next article in two weeks, if that is not relevant for you….
In the good old days system administration meant logging into the server and doing something there. Of course good system engineers wrote scripts in Bash, Ruby, Perl or Python and got things done more efficiently. When we were maintaining 1000 vending machines of a public transport company, we wrote scripts that ran on one Linux-machine in our office and iterated through a whole list of machines to perform a task on them. Usually one, then another one, then five or so and at some point the rest.
It is not totally gone, it is still necessary to log into the machine to do things that are not yet automated, that are not worth while automating or that simply did not work as desired. Or to check if things did work as desired.
But with tools like ansible it is now possible to do what was done with these scripts in the old days. Ideally the idea is to describe the end state, for example a certain:
– We want a certain user to be present. If it is present, nothing is done, if not, the user is created.
– We want a certain file system to be mounted on a certain mount point. Same idea….
– We want a certain package to be installed. If it is missing, it is installed.
The files that describe the desired outcome are in a directory tree and this is called a playbook. We can think of it as a high level scripting langauge, just a bit like SQL, where we also describe the outcome and not how to get there. And as always, there are ways to fall back to python and bash where the built in features are not sufficient.
Ideally, ansible playbooks, as they are called, are idempotent. That means, they can be exucuted as many times as we want without creating harm. That makes it much easier to update 1000 hosts. We can just try the update on 1, 2, 5, 10, 50 and 100 and then on all and it does not matter, if we actually just give the whole list in the last step or if we by mistake have the same host multiple times. It does matter a bit, because the whole thing becomes slower, the more hosts we have in the list. But still it is no big risk of messing things terribly up.
But this depends on how the playbooks are written. So the goal of writing idempotent playbooks needs to be taken care of the achieve this. In cases where standard ansible features are used, this is usually the case. But when the playbooks are using our own extensions or calling python or bash scripts, we are on our own and need to keep an eye on this or deal with the fact that the playbook is not idempotent.
Now usually there is a file called inventory, that contains „all“ hosts. This needs to be provided. Often the hosts in the file are put into groups and certain steps apply only to one group. Now the actual installation can and usually should be limited to a subset of this „all“, for example when we are installing on a test system and not on production servers. It is possible to limit this to a group, a single host, a short list of hosts or to the hosts found in another file (not the inventory). With some simple scripts it is for example possible, to run the playbook on the next n hosts that have not been processed so far or to split up work with a colleague.
And the playbooks should of course be managed by a source code management system, for example git.