I think I’m finally fully recovered. Whew. Last Saturday afternoon I got an email from the power company saying, oh hey, we’re turning the power off tonight, sorry about that. Such a thing is always a possibility but still never welcome. Since I’m pretty serious about avoiding interruptions of computer service, it is something I try to have a plan for. But on this occasion I had a little problem.
First let’s introduce my main personal server, raven.
Raven lives on my desk and is on 24/7 where it does all of my important things. Email, cvsroot, most programming, blog stuff, help notes, and general writing all happen on this server. It doesn’t have a monitor or a keyboard. Instead, I log into it over SSH from wherever I happen to be. It is my own personal cloud.
One of the important features of this machine is a RAID1 disk pair which means that everything that is ever written to disk is actually written to two disks simultaneously. In the event a disk fails, the other carries on with no loss of service. I’ve been using this setup since 2005 and have written about it before.
This machine also runs a Xen VM hypervisor and the real host of interest is a Xen VM. This means that if something causes problems with the VM, it’s easy to restore a backup image of it and try again. And when I say "something" I mean a Gentoo update. Oh ya, it’s all Gentoo, of course, because I’m insane. Anyway, many months ago, I updated the hypervisor and I watched with some apprehension as GRUB got switched out for GRUB2. Normally, that kind of thing in the computer world is progress, but GRUB2 is a huge pain in the ass and plagued with challenges. So much so that it is, to me, basically unusable. Thanks to this updated bootloader, I wrote the following note to myself.
There was some emerge shenanigans recently and GRUB seems to have been changed from legacy to GRUB2. I don’t know why GRUB2 wasn’t used to begin with but it looks like I attempted it and then gave up. Maybe I’ll find out why.
I created a new /boot/grub/grub.cfg (none of that makefile bullshit) and I ran grub-install which I think is pointing to GRUB2 these days. Hard to tell really.
I anticipate there will be reboot problems. Raven is up and fine now, but if a reboot happens, it may be very hard to restart because the hypervisor won’t want to boot. This must be tested.
All screens on raven will have to be shut down. After the attempted reboot fails, a monitor and perhaps a keyboard will have to be arranged for xedxen and troubleshooting can begin. Chances are good that sysresc will be needed.
And guess what? After 10 hours of no power I discovered everything in that note came true. Since you don’t care about this nearly as much as I do, you’re asking, what’s the point here? Who cares?
This situation got me thinking about my server design and I came up with some modifications.
- There is no escape from GRUB2. It simply must be used. This is problematic because, ironically, doing complicated things with it is much harder. The reason for that is that they worked very hard to make it much easier to do complicated things but they completely failed. Now instead of a well-documented configuration file that can be configured, there are distro specific automagical configuration systems that "compile" into the configuration file which, if done this way, is not human readable. My solution - use GRUB2 in the simplest way possible.
- I want RAID1 and GRUB2 maybe can do this, maybe not. Don’t worry about it. Don’t let GRUB2 even try. Make GRUB2 boot the simplest thing possible and let that system figure out anything more complicated.
- I feel the same way about initramfs. From my notes on the topic.
I like to have my kernel do what is needed because it is set up exactly the way it should be. And in 2005, this worked for something like mounting a simple Linux system that just happened to be protected by a kernel managed software RAID1 double drive setup. But as time wore on, the metadata for the RAID volumes changed and although the new bootloader understood it, the kernel, weirdly, did not. No one seemed to care that these formerly robust systems were now broken. This was because *every*body used an initramfs so that the kernel wouldn’t have to do its job. I fought this and tried to do it the Right Way, but eventually I just ran into too many brick walls.
But with the simplest boot setup possible, one does not need an initramfs. And this is good. I’m also weird in that I do not load modules. Any of them. If I need kernel functionality, I compile it in. Normal off-the-rack clothes are not better than bespoke tailored ones just because the former are popular.
- Magnifying the complexity considerably is the Xen hypervisor. Since I felt I needed this setup to be a VM, that’s what needed to happen. But it turns out that this is fine. In fact it’s a perfectly reasonable answer to all of the above problems.
The solution to all the problems is to take the preboot complexity of GRUB and initramfs and the right kernel modules and a RAID1 setup and exchange it for a Xen hypervisor (which I was already committed to anyway). The idea is that instead of running all these very complex things (GRUB2, initramfs) to provide for what will be needed on the complicated running server, just boot a hypervisor in the simplest way possible. Then assemble your RAID1 disk set and select the right kernel with the right stuff and fire up the VM.
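The sequence above (hypervisor is up, assemble the RAID1 set, fire up the VM) could be scripted roughly like this. It is only a sketch under assumptions the post does not confirm: that the xl toolstack is in use (older Xen installs use xm), that mdadm can find the array with a scan, and that the guest config lives at the made-up path /etc/xen/raven.cfg. By default it only prints what it would run.

```python
import subprocess

# Hypothetical path; the post does not say where the guest config lives.
GUEST_CONFIG = "/etc/xen/raven.cfg"


def startup_plan(guest_config=GUEST_CONFIG):
    """Return the commands to run, in order, once the hypervisor is up."""
    return [
        # Assemble every array mdadm can identify from its config or a scan.
        ["mdadm", "--assemble", "--scan"],
        # Start the guest whose disk line points at the assembled array.
        ["xl", "create", guest_config],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            # check=True raises on a non-zero exit, so the VM is never
            # started after a failed assemble step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_plan(startup_plan())
```

One caveat: if the array has already been assembled by the time this runs, mdadm may exit non-zero and stop the sequence, so a real version would probably want to check for the device first.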
The only change from my previous system is that the hypervisor is not also on the RAID1 system. (And now finally the point!) Here’s the thing, it doesn’t matter! If I do everything with my VM, and nothing with my hypervisor, why would I ever need RAID1 with that? In fact, it’s better to not have that be mirrored in real time.
On my hardware (you can kind of see it), I have 2 small micro SSDs and 2 normal SSDs. The former were the RAID set for the hypervisor and the latter were for my real server. But I realized, why not just set up one of the micro SSDs to boot the hypervisor just fine and then copy that setup to the other one? It wouldn’t be live, but if I messed up the live hypervisor (as I did with the GRUB clobbering update), I could actually just boot up the reserve one. Since no value gets added to the hypervisor drive once it’s all up and running, there’s no reason for RAID1. And that cures all of the problems.
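Copying the working hypervisor drive onto the reserve one is a plain byte-for-byte copy, and it is worth checking that the copy really matches. Here is a minimal sketch of that idea. The function names are mine, it assumes the source is not being written to during the copy, and on real hardware the paths would be whole-disk device nodes (which ones is for you to triple check, since the destination gets overwritten).

```python
import hashlib
import os

CHUNK = 4 * 1024 * 1024  # read 4 MiB at a time


def clone(src_path, dst_path):
    """Copy src to dst byte for byte.

    Returns (bytes_copied, sha256_hex) for the data that was read.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(CHUNK)
            if not block:
                break
            dst.write(block)
            digest.update(block)
            copied += len(block)
        # Push the data out of Python's and the kernel's buffers.
        dst.flush()
        os.fsync(dst.fileno())
    return copied, digest.hexdigest()


def verify(path, nbytes, expected_hex):
    """Hash the first nbytes of path and compare with expected_hex.

    Only the first nbytes are checked so that a reserve drive larger than
    the source still verifies.
    """
    digest = hashlib.sha256()
    remaining = nbytes
    with open(path, "rb") as f:
        while remaining > 0:
            block = f.read(min(CHUNK, remaining))
            if not block:
                break
            digest.update(block)
            remaining -= len(block)
    return remaining == 0 and digest.hexdigest() == expected_hex
```

Usage would be something like calling clone() on the live and reserve drives and then passing its two return values to verify() for the reserve drive. Since the hypervisor drive rarely changes, this only needs to be repeated after a hypervisor update that has been proven to boot.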
Now once the hypervisor is happily running with absolutely no complicating elements, I can put together the important RAID1. Then I just use that raid set for the important server. Here’s my Xen configuration showing the multi-disk device md127 being mapped to the VM’s /dev/sda1.
disk = ['phy:/dev/md127,sda1,w'] # <-- raven
Let’s not ask why it’s 127 and not 0. I’m not happy about it, but I’ve learned to just accept that madness.
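Since the VM now depends entirely on that array being there, a small pre-flight check before starting the guest seems sensible. The sketch below pulls the device out of a disk line in the three-field form shown above and looks it up in text in the format of /proc/mdstat. The function names and the sample mdstat text are invented for illustration, and the parser only handles the simple three-field form, not everything Xen accepts.

```python
import re

# Made-up sample in the usual /proc/mdstat layout, for illustration only.
SAMPLE_MDSTAT = """Personalities : [raid1]
md127 : active raid1 sdb1[1] sda1[0]
      976630464 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""


def parse_disk_spec(spec):
    """Split 'phy:/dev/md127,sda1,w' into (target, vdev, access)."""
    target, vdev, access = [part.strip() for part in spec.split(",")]
    if target.startswith("phy:"):
        target = target[len("phy:"):]
    return target, vdev, access


def raid1_is_healthy(mdstat_text, device):
    """True if device (e.g. 'md127') is an active raid1 with all members up."""
    lines = mdstat_text.splitlines()
    header = re.compile(r"^%s\s*:\s*active\b.*\braid1\b" % re.escape(device))
    for i, line in enumerate(lines):
        if header.match(line):
            # The member status, e.g. [2/2] [UU], is on the following line.
            status = lines[i + 1] if i + 1 < len(lines) else ""
            m = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", status)
            return (
                m is not None
                and m.group(1) == m.group(2)
                and "_" not in m.group(3)
            )
    # Device not listed at all: treat as not healthy.
    return False
```

On the real machine the text would come from reading /proc/mdstat, and the device name from the last path component of the parsed target. A degraded mirror shows up as something like [2/1] [U_], which this check reports as unhealthy. Whether to refuse to start the VM in that case or just complain loudly is a judgment call, since carrying on with one disk is the whole point of RAID1.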