automated installations
10 Feb 2006
The task of maintaining about 70 Debian machines fell upon me when I accepted my current job. At this moment, we have 70 machines running Debian releases from Debian testing all the way back to Debian slink (!). That includes the versions testing (not yet released), sarge (3.1), woody (3.0), potato (2.2) and slink (2.1).
Furthermore, no one ever seemed to care much about keeping things clean and easy to manage, so instead of making Debian packages, putting them in a central repository and installing from there, most software we use (including daemons, kernels, Perl libraries, etc.) is compiled from source.
To make things even worse, some software that exists as a Debian package is still installed from source. The reason for the latter is that my colleagues were scared that updating to the next Debian release might break the entire distribution. So they decided to just keep the distribution as it was and compile things from source.
I got sick of this situation quite rapidly. I have no problem patching a single kernel source and compiling it with special options enabled. But when this compilation process is the same for all 70 machines, I see no reason why I should log into each machine and execute the same set of commands on all of them (downloading, unpacking, patching, configuring, compiling and installing the software by hand).
Instead, I created a script that does all of the above and builds a nice clean Debian package that goes into a central repository. Then I just install the package on all servers. Problem solved!
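The script itself is nothing fancy. Roughly, it does something like this (a sketch with made-up names, versions and hosts; checkinstall is one way to wrap the "make install" step into a .deb):

#!/bin/sh
# sketch of the build script: fetch, patch, compile, package, upload
# (package name, version, URL and repository host are invented)
set -e
VERSION=1.2.3
wget http://example.org/foo-$VERSION.tar.gz
tar xzf foo-$VERSION.tar.gz
cd foo-$VERSION
patch -p1 < ../foo-local.patch
./configure --prefix=/usr
make
# checkinstall runs "make install" and turns the result into a .deb
checkinstall --default --pkgname=foo --pkgversion=$VERSION make install
# push the package to the central repository
scp foo_$VERSION*.deb repo:/var/packages/

Run that once, and every machine just does an apt-get install instead of a compile.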
Not quite.
Besides maintaining software, I also need to install new machines from time to time. The current process is as follows:
- boot the new machine from a CD
- copy the entire contents of a master disk from another computer
- go over all configuration files and change them
- boot the new machine and hope everything works
This is a horrible way of working. After doing this a couple of times, I got sick of it as well.
Being the lazy guy that I am (or call it efficient if you will), I set out to standardize all our machines.
Every machine can be viewed as a collection of hardware and software. The hardware is abstracted by the kernel and the software runs on top of that kernel. So whatever hardware we have, the software doesn't care. More about the kernel later.
Suppose we have 3 machines: a webserver, a logserver and a monitoring server. They all run on hardware and have a kernel. On top of this kernel, we run Debian. Debian consists of a number of installed packages. For the webserver, for example, we obviously need some webserver software (apache), an SSH server to admin the machine remotely, some watchdog script to restart the webserver if it crashes, an editor to edit files with, etc. The logserver requires a logging service (syslog-ng for example), as well as an SSH server, a watchdog, an editor, etc. The same goes for the monitoring server.
Now it doesn't take much to realize that most setups are the same except for a small number of packages. Instead of reinstalling each server from scratch and then installing all the needed packages, we will work with a default server install, a generic server if you will. This install will have an SSH server, a watchdog, an editor, etc.
Now if you want to install a webserver, you need a default install + apache. For a logserver, a default install + syslog-ng, etc.
Much, much cleaner and easier to maintain: the setups are largely homogeneous and the environment in which you have to work is the same everywhere. Once you install such a server, you need to make sure the separate installations stay in sync with each other. This means that every time you need something on a certain machine, you should ask yourself whether that package should go into the default install or not.
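One Debian-native way to capture such a default install (a sketch; the package names and the choice of monit as watchdog are just examples) is an empty metapackage built with equivs, which does nothing except depend on everything a generic server needs:

Section: misc
Priority: optional
Standards-Version: 3.6.2
Package: default-server
Version: 1.0
Maintainer: sysadmin <root@example.org>
Depends: openssh-server, vim, monit
Description: generic server install
 Empty metapackage that pulls in the packages every server needs.

Feed that control file to equivs-build and you get a default-server_1.0_all.deb for the repository; a webserver is then just this package plus apache.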
All of this can be achieved thanks to the package management system of this Linux distribution. All machines will look alike for the most part, because we took what all the machines have in common and made that the standard. The only thing to worry about now is at the lower level: the kernel.
The 70 machines we have can be divided into a number of categories. We have Intel machines and AMD machines. Both are rackmountable, but we also have Intel machines that are tower models.
Thus the server park consists of 3 machine types. The key differences are processor type and network card brand.
The kernel has abstraction layers of its own. It is constructed, for the most part, of platform-independent code. All platform-specific code is written in assembler (a language for programming with the most basic instructions of a processor). The drivers are the same for each processor type because they are written in a higher-level language (C).
Anyway, enough with the nerd-talk. Suffice it to say that we need 2 types of kernels for those 3 types of machines. Since we can include the drivers for all the network cards we're gonna use, we only have to worry about the processor type :)
[Actually, in our case we also distinguish between machines that need NAT functionality and those that don't. This is because of performance issues. So we actually have 4 kinds of kernels: Intel or AMD, each with or without NAT.]
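Building those 4 flavours doesn't have to be painful either. With Debian's kernel-package, each flavour is roughly one command, run against its own .config (a sketch; the revision and flavour name are made up):

cd /usr/src/linux
# repeat once per flavour, with the matching .config in place
make-kpkg clean
make-kpkg --revision=1.0 --append-to-version=-intel-nat --initrd kernel_image

That produces a kernel-image .deb that installs from the repository like any other package.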
I could go on and on about this, but I'm not writing a book about this (yet).
Let's skip to the rant, shall we?
To accomplish all of the above, I am creating a special boot CD based on the Debian install CD. The idea is that you can put the CD in a new machine, boot it up, answer a few questions (like the machine's name, its IP address and what type of standard server it is going to be, e.g. webserver or logserver or ...), and then the CD installs everything automagically (using debian-installer preseeding).
The whole thing wouldn't take more than 15 minutes.
I've worked out most of the details.
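To give an idea, the preseed file contains canned answers to the installer's questions, roughly like this (a sketch with made-up values; the real file has many more entries, plus something to select the webserver/logserver package set):

d-i netcfg/disable_dhcp boolean true
d-i netcfg/get_hostname string webserver01
d-i netcfg/get_ipaddress string 192.168.1.10
d-i netcfg/get_netmask string 255.255.255.0
d-i netcfg/get_gateway string 192.168.1.1
d-i netcfg/get_nameservers string 192.168.1.1
d-i mirror/http/hostname string debian.example.org
d-i mirror/http/directory string /debian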
Right now I'm working on the partitioning scheme of the hard disk. On all our servers, we have 3 partitions: a swap partition, a /boot partition and a root partition. All 3 are primary partitions (as opposed to extended/logical partitions).
I want to keep doing things that way because it doesn't make sense to do it any other way.
In order to automatically partition a disk, the Debian installer has a tool called partman-auto. You can tell it what you want your partitioning scheme to look like and it creates it for you.
At least, it is supposed to.
I cannot get the damn thing to create 3 primary partitions. Furthermore, the documentation sucks because it is incomplete, the code is largely undocumented, and the usage is not intuitive (and doesn't follow the principle of least surprise)...
The rest of this text will assume that you know what I'm talking about, because I don't feel like explaining everything :)
What I want to do is create 3 partitions in this order:
sda1: 2GB swap,
sda2: 64MB /boot,
sda3: /
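For reference, each partition stanza in a recipe starts with a size line of the form

<minimal size> <priority> <maximal size> <filesystem>

with the sizes in megabytes (more on that mysterious priority field further down).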
I tried using the following recipe:
# boot-root ::
# 2048 2048 200% linux-swap
# $primary{ }
# method{ swap } format{ }
# .
# 64 64 64 ext3
# $primary{ } $bootable{ }
# method{ format } format{ }
# use_filesystem{ } filesystem{ ext3 }
# mountpoint{ /boot }
# .
# 500 10000 1000000000 ext3
# $primary{ }
# method{ format } format{ }
# use_filesystem{ } filesystem{ ext3 }
# mountpoint{ / }
# .
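For completeness: in the preseed file, the whole recipe is passed as the value of a single debconf question, with backslashes as line continuations (only the first stanza shown here):

d-i partman-auto/expert_recipe string \
    boot-root :: \
    2048 2048 200% linux-swap \
    $primary{ } \
    method{ swap } format{ } \
    . \
    ...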
Notice the $primary{ } keyword on every partition. If I use it on all 3 partitions, partman-auto complains that there are too many primary partitions. However, the partition table allows for 4 primary partitions, so I should be OK.
I might be falsely assuming that $primary{ } means: allocate a primary partition for this partition. Maybe it means something totally different, but since there is no decent documentation, I cannot tell. I've tried looking at the source of partman-auto (which is written in bash), but I can't make heads or tails of it. One interesting thing I noticed is that it doesn't seem to use any binary?
This would mean that the whole partitioning business is done from bash, I assume through a /dev interface. That's really cool :)
Anyway, I would love it if someone could show me an example recipe for 3 primary partitions that I could reuse. I would also love a better partman-auto package that isn't so obscure and obfuscated, and some more documentation with example recipes. [What the fuck is the priority value in the recipes? From what I can see of the whacko algorithm that calculates the partition sizes, it is used to compute a factor for how much a partition is allowed to grow. If you make the priority equal to the minimum size, the calculated size will be exactly the minimum size. I can see a use for that, because I want to specify EXACTLY how big my partitions have to be. But in any other case, what is the effect of the priority?]
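For what it's worth, my current reading of the source (an assumption on my part, not something the docs confirm) is that every partition first gets its minimum size, and the leftover free space is then divided in proportion to (priority - minimum), capped at the maximum. A quick worked example with two partitions, each with a 100MB minimum, and 1000MB of free space left over:

partition A: min 100, priority 300 -> factor 200
partition B: min 100, priority 900 -> factor 800
A gets 100 + 1000 * 200/1000 = 300MB
B gets 100 + 1000 * 800/1000 = 900MB

With priority equal to the minimum, the factor is 0 and the partition stays at its minimum, which matches what I'm seeing.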
I'm going to talk to the authors of this thing. I hope they can help. If not, I think it will probably be easier if I program a partitioner myself.