Running a Large Number of Servers

These days we often have to run a large number of servers, and the times where we could afford to manually log into each one to do system administration tasks are mostly over.

It turns out that there are always different approaches to deal with this. In most cases we are talking about virtual hosts, so we have a layer between via the visualization that can help us. We can have a number of master images and create virtual images from those even on demand in a matter of a minute. In case of MS-Windows it is an issue that they have some internal UUID as host-id which should be unique and which is heavily spread throughout the image, but this issue can be ignored if we do not worry about windows domains. Usually we do and I leave dealing with these issues to MS-Windows-experts.

Talking about Linux, we only need to make sure that the network interface is unique, which it is if we use hardware and do not mess around with it, but it is not necessarily if we use visualization and virtual network devices. This issue needs to be addressed, but it is well supported by common visualization tools. Another point is the host name. This is not too hard, because we only need to change it in one or two places, which can easily be done by a script. We can mount the image and do the change. Now the image can contain a start-up script that discovers on boot that it is a fresh copy and uses its host-name to retrieve further setup from some server. And we just have to maintain there which host has which setup. These can be automated to a very high extent. Then we can for example request a certain number of servers with certain software and configuration via an web interface. This creates new host-names, stores the setup with these host-names in its setup table, creates the virtual images, deploys them on any available hardware server and once they have stared they retrieve their setup from the server. We can also have master images that already contain certain predefined setups so that this second step is only needed for minor adjustments. We have to assume that these exist. Yes, this is called cloud technology.

If we keep the data somewhere else, these servers can be discarded and new ones can be created, so there is no need to do too complex stuff on them. Of course we want to run our software on them. So the day long procedure to install our software is not attractive any more. We need mechanisms that can be automated.

Running real hardware is a bit more demanding and for larger servers that might even be justifiable, because they do a lot of work for us. Quite often it is possible to actually do mechanisms quite similar to the virtual world even on real hardware. It is possible to boot the machine from an USB-stick which copies fetches an image and copies it on disk. Again only the host-name needs to be provided and then the rest can be automated. Another approach is to initially boot via the network, which is an option that most of us rarely use, but which is supported by the hardware. For running a large server farm such a hardware and bios setting can just be initially the default and from there machines can install and reconfigure themselves. In this case we probably need to use the Ethernet-address of the network device as a key to our setup table and we need to know what Ethernet addresses are in use. It is a big deal to set up such an environment, but once it is running, it is tremendously efficient. Homogenous hardware is of course essential, maybe an small number of hardware setups, but not a new model with each delivery. It is not enough that the new hardware is named the same as the old one, it needs to be able to run the same images without manual customization. It is possible to have a small number of images, but having to supply already different images for different server setups multiplying there number with the number of the hardware setups can grow out of control, if one of the numbers or both become too large.

Now we also have ways to actually access oure servers. there have been tools to run a shell just simultaneously on n hosts to do exactly the same at once. This is fine if they are exactly the same, but this is something we need to enforce or we need to accept that servers deviate. There are tools around to deal with these issues, but it is actually quite reasonable to do a script based approach. What we do is using ssh-key-exchange to make sure that we can log into the servers from some admin server without password. We can then define a subset of the set of our servers, which can be one, a couple, a large fraction, all or all with a few exceptions, for example. Then we distribute a script with scp to all the target machines in a loop. We run this on each target machine using ssh and parse the outputs to see which have been successful and which not. Here it is usually a good idea to have a farm of test servers to try this out first and then start on a small number of servers before running it on all of them.

The big bang philosophy of applying a change twice a year on the whole server landscape is not really a good idea here, because we can loose all our servers if we make a mistake and this can be hard to recover, although still have the same tools and scripts even for that, unless we really screw things up. So in these scenarios software that supports the interfaces of the previous version for its communication partners is useful, because it allows to do a smooth migration.

Just to give you a few hints: During some coffee break I suggested that Google has around a million servers. Even though there is no hard evidence for this, because this number seems to be confidential and only known to Google employees, I would say that this is a reasonable number. For sure they cannot afford a million system administrators. The whole processes needs to be very stream-lined. Or take the hosting provider where this site is running on. It is possible to have virtual web-hosts, in this case it is multiple sites running on the same virtual or physical machine sharing the same Apache instance with just different directories attached to different URL-patterns. This is available for very little money, again suggesting that they are tremendously efficient.

Share Button

Brass Music and what we can learn for IT

The English term „music“ refers to what we actually listen to, but also to how we write it down on paper, like this:

Music handwritten by Johann Sebastian Bach

Music handwritten by Johann Sebastian Bach


This musical notation is actually like a programming language, because it allows to write down complex musical pieces on paper.

But there is of course more to it than just mechanically playing what is on the paper and dealing with the inconveniences of the musical instruments. What makes it pleasant is the interpretation and that requires skill and intuition and experience and feelings. Since this is not a music blog, I will leave this as it is and stick with the relatively irrelevant side issue of the musical notation language, how we write music.

Generally there might be issues that it is hard to read, because things look too similar, but on the other hand musicians just see it immediately and at least fast enough to work efficiently with it, so I guess that this way of writing music is generally OK.

Now we have the possibility to cover a certain range, slightly more than two octaves, efficiently. Beyond that it will get hard to count the auxiliary lines. To cover different instruments, at least three kinds of Clefs are in use and the same note usually means the same. I think that there are ways to shift the whole system by one ocatve, at least for beginners, but usually with the three clefs that is not necessary for the whole piece of music.

Now for some brass instruments we have different sizes, as for other instruments as well. So the same way to play it yields a different tone on different sizes of the instrument. Just take the recorder, which has five common sizes. They are based on different f and c notes and when you play an f-based recorder you have to adopt to this by playing an f when reading an f-note, for example by closing all wholes. On a c-based recorder you read a c (the deepest you can regularly play) and close all holes to play this. Normally people know this and can deal with this. For brass instruments a different approach was chosen. For situations where you actually want to hear an F and might actually write an F in the notes for one size of the instrument, the larger or smaller instruments just call something an „F“ which is actually not an F at all for the rest of the musical world. So for these instruments „F“ does not mean the tone that you hear, but the grip combination that you do to achieve the tone, simplified. It was supposedly meant to make it easier for relatively unskilled musicians to adapt to different sizes of their instruments, but now even professionals have to live with this.

So they invented a new mechanism to simplify things, which in the overall view makes things a lot more complicated and simplifies just something trivial that even average skilled musicians can easily learn.

I respect the musicians for what they do and I guess since they can deal with this irregularity, it is kind of OK. Or at least up to the musicians to decide if they want to fix this or not.

But we can learn a lot for IT solutions from it.

We often have the situation that we need to adapt a software for another related, but slightly different use case. And we often get the request to simplify things.

It is important to think carefully at which level we do the adaption, so that it will make sense in the long run.

And we should simplify things, but there is no point in trying to make things simpler than they actually are, this simply cannot work an will backfire.

Share Button