Booting

All about how BitFolk VPSes are booted.

Booting isn't easy when you're paravirtual

All BitFolk VPSes are currently paravirtualised (PV). As a consequence they don't get to see a real disk nor do they have a pretend BIOS which loads a real bootloader.

pygrub

Up until mid-February 2017 we cheated and ran a thing called pygrub, which was a Python script that pretended to be GNU GRUB. It read your block devices and tried to read a GRUB-legacy config file. That's no longer in use for a variety of reasons so we'll say no more about it for now. If you are interested then read the "History" section below.

All hail pvgrub

These days every BitFolk VPS is booted via pvgrub2. This is the real GRUB2 binary compiled as a Xen PV guest kernel. This has a number of advantages:

  • It runs in a (your) virtual machine so there should be no way that you can compromise BitFolk's host machines even if you find some nasty exploit in a filesystem driver.
  • It uses an actual GRUB binary so supports everything that GRUB is supposed to support, with an interface that is more robust than pygrub's.
  • It can chainload to another bootloader inside your VPS if you have exotic requirements.

There is one slight disadvantage:

  • BitFolk now needs to know whether your guest is 32- or 64-bit.

Xen decides whether your guest is 32- or 64-bit depending on what type of kernel image it is asked to boot. When pygrub was in use, your kernel was extracted out of your block device and so Xen could tell directly whether your guest was 32- or 64-bit. With pvgrub2, the GRUB binary is your kernel as far as Xen is concerned, and it can only boot kernels of the same architecture as itself.
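To make the 32- versus 64-bit distinction concrete, here is a minimal sketch (not anything BitFolk runs) of how you could check an ELF image such as a pvgrub2 binary yourself. It relies only on the ELF header layout, where the byte at offset 4 is 1 for a 32-bit image and 2 for a 64-bit one. The function name elf_bits is made up for illustration.

```sh
# Print 32 or 64 according to the ELF class byte of the given file.
# Byte offset 4 of an ELF header (EI_CLASS) is 1 for 32-bit, 2 for 64-bit.
elf_bits() {
    # Check the 4-byte magic number first: 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # Skip 4 bytes, then read the single class byte as a decimal number.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

For example, elf_bits /boot/xen/pvboot-x86_64.elf would be expected to print 64 if that file exists in your VPS. A compressed Linux kernel (vmlinuz) is usually not an ELF file, so the function will report not-elf for those.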

Setting your architecture

You've been able to set the architecture of your VPS in the Xen Shell for quite some time, using the arch command.

In the past though this command's only real use was to hint to installers which sections of distribution mirrors they should look at. It has been possible, through use of the Rescue VM, to install a different architecture without using the arch command and still have your VPS boot anyway. This will no longer be possible, so if your VPS no longer boots, checking the architecture is the first thing you should do.

Using the arch command just tells us which CPU architecture you'd like to boot, and which mirrors to use if/when you later execute one of the installers with the install command. It does not alter your existing VPS in any way. The only way to actually alter the CPU architecture of your VPS is to install an operating system of a different architecture (or cross-grade if your operating system supports that).

Boot process

Animation (cufm6gd.gif): a terminal session in which a Debian stretch VPS is booted with a GRUB-legacy config file. The bootloader config is viewed and booted, then the grub-pc package is installed to convert the VPS to GRUB2. ttyrec or ttygif seem to have introduced some corruption and offset characters, but you probably get the idea.

If you're looking at your console in the Xen Shell then the first thing that you should see is a list of boot methods that BitFolk's GRUB recognises. It checks for each of the following things, in order, on each of your block devices and every partition of those block devices:

  1. Chainload to a bootloader at:
    • /boot/xen/pvboot-$ARCH.elf
    • /xen/pvboot-$ARCH.elf
  2. Load a GRUB config from:
    • /boot/grub/grub.cfg
    • /grub/grub.cfg
  3. Load a GRUB-legacy config from:
    • /boot/grub/menu.lst
    • /grub/menu.lst

If the relevant paths do not exist then the menu entry will not be present. It checks both /boot/blah… and /blah… so that you can have /boot mounted on a separate partition and still have it work.
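The search order above can be sketched in shell for a single filesystem. This is only an illustration of the order and the "missing path means no menu entry" rule, not BitFolk's actual GRUB configuration. The function name list_boot_methods and the labels it prints are hypothetical, and the real GRUB repeats the check for every block device and partition.

```sh
# List, in the order described above, the boot methods that would be
# offered for one filesystem mounted at ROOT.
# (Hypothetical helper; not BitFolk's actual GRUB configuration.)
list_boot_methods() {
    root=$1
    arch=$2    # e.g. x86_64, as in /boot/xen/pvboot-x86_64.elf

    # 1. Chainload to a bootloader.
    for path in "/boot/xen/pvboot-$arch.elf" "/xen/pvboot-$arch.elf"; do
        if [ -f "$root$path" ]; then echo "chainload $path"; fi
    done
    # 2. Load a GRUB config.
    for path in /boot/grub/grub.cfg /grub/grub.cfg; do
        if [ -f "$root$path" ]; then echo "grub2 $path"; fi
    done
    # 3. Load a GRUB-legacy config.
    for path in /boot/grub/menu.lst /grub/menu.lst; do
        if [ -f "$root$path" ]; then echo "legacy $path"; fi
    done
    return 0
}
```

From somewhere like the Rescue VM, with your root filesystem mounted at (say) /mnt, list_boot_methods /mnt x86_64 would print one line per entry you should expect to see in the menu.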

You'll have 5 seconds to choose a menu entry, otherwise the topmost one will be automatically selected. If it fails to boot then the next one down will be tried, and so on, until all are exhausted. This allows for automatic (eventual) boot as long as one of the methods will work.

Note that at any time you can use the usual GRUB key commands, e.g. Esc to go back, e to edit the configuration, c to open a GRUB command line, and so on. It is perfectly possible to boot your VPS with no actual bootloader configuration, by typing all the GRUB commands at the command line.

Each menu entry has a further 2 second pause to inform you of what is going on. If you're watching and impatient this can be skipped by pressing Esc. It will then carry on with the method you selected.

Chainloading

Chainloading to another bootloader is an advanced topic that most users will have no interest in, but it needs to be the first thing attempted because a typical chainloading setup would also have a /boot/grub/grub.cfg file present.

All the other methods involve your GRUB config being interpreted by BitFolk's GRUB binary. If for some reason you need your own bootloader binary to be run then you can install it in the proper place within your VPS (e.g. /boot/xen/pvboot-x86_64.elf) and control will be passed to it. More about that later.

Loading a GRUB config

Loads a GRUB config from the specified path, which generally will result in a familiar-looking GRUB interface with whatever boot entries are in your own grub.cfg.

Loading a GRUB-legacy config

Loads a GRUB 0.x legacy-style config, the type of thing you are used to seeing in menu.lst files.

As of mid-February 2017 the vast majority of BitFolk VPSes have GRUB-legacy configurations, but we would expect these to dwindle as GRUB-legacy ceases to be packaged in modern Linux distributions.

Upgrading from GRUB-legacy to GRUB2

Almost every customer has a GRUB-legacy bootloader config in a menu.lst file. As long as that is working it is best kept. Since BitFolk's GRUB will check for grub.cfg before it checks for menu.lst there is no harm in leaving your GRUB-legacy configuration in place. It won't even slow the boot process down.

If you do want to move to GRUB2 then, on Debian/Ubuntu, the process would be something like:

  1. # apt-get install grub-pc
    This will remove the grub-legacy package but leave menu.lst in place.
  2. Check that your /boot/grub/grub.cfg file looks sane. Maybe you need to edit /etc/default/grub a bit and run update-grub to get it how you like. Popular changes include removing the "quiet" boot option.
  3. # reboot
    Or go to the Xen Shell and shutdown or whatever.
  4. A new menu entry for /boot/grub/grub.cfg should have now appeared. Either select it or let it be selected automatically.
  5. If that doesn't work out, go back and select the /boot/grub/menu.lst option which should still be working.

After you're happy that the newer version of GRUB is working you can delete menu.lst.
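Step 2 above mentions removing the "quiet" boot option. As a minimal sketch, assuming your /etc/default/grub sets GRUB_CMDLINE_LINUX_DEFAULT on one line in double quotes (the Debian default), something like the following would print an edited copy for you to review. The function name strip_quiet is made up for illustration, and it deliberately does not modify the file itself.

```sh
# Print the contents of an /etc/default/grub-style file with the word
# "quiet" removed from GRUB_CMDLINE_LINUX_DEFAULT. All other lines pass
# through unchanged. The input file is not modified.
strip_quiet() {
    sed -e '/^GRUB_CMDLINE_LINUX_DEFAULT=/ {
        s/\([" ]\)quiet\([" ]\)/\1\2/
        s/  */ /g
        s/=" /="/
        s/ "$/"/
    }' "$1"
}
```

You could run strip_quiet /etc/default/grub, compare the output with the original, copy it into place if you are happy with it, and then run update-grub as described in step 2.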

Advanced stuff

Navigating the GRUB2 command line

GRUB knows your block devices as xen/something. There's tab completion so you can do:

grub> ls (xen/<tab>

and get a list of all block devices GRUB has seen, and then something like:

grub> ls (xen/xvda,msdos1)/boot/

will give you a list of all the files in the /boot directory of the filesystem on the first MSDOS partition on the disk device called xvda.
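If you are used to Linux device names, the GRUB names take a moment to get used to. The sketch below shows the usual correspondence, on the assumption that (xen/xvda,msdos1) is what Linux inside your VPS calls /dev/xvda1. That mapping is an assumption for illustration rather than something GRUB guarantees, and the function name grub_to_linux_dev is made up.

```sh
# Translate a GRUB device such as (xen/xvda,msdos1) to the Linux device
# node it usually corresponds to (/dev/xvda1). Illustrative only.
grub_to_linux_dev() {
    dev=${1#"("}       # strip the leading "("
    dev=${dev%")"}     # strip the trailing ")"
    dev=${dev#xen/}    # strip the "xen/" prefix
    disk=${dev%%,*}    # everything before the first comma
    case "$dev" in
        *,*) part=${dev#*,}          # e.g. msdos1 or gpt2
             num=${part##*[!0-9]}    # keep only the trailing digits
             echo "/dev/$disk$num" ;;
        *)   echo "/dev/$disk" ;;    # whole disk, no partition
    esac
}
```

For example, grub_to_linux_dev '(xen/xvda,msdos1)' prints /dev/xvda1, and a whole-disk name like '(xen/xvdb)' prints /dev/xvdb.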

More about chainloading

As far as we're aware, Debian is the only Linux distribution to have packaged a PV-enabled bootloader. Maybe also derivatives like Ubuntu.

In your VPS, if you install the grub-xen package you'll find some image files now in /boot/xen/. That's the GRUB binary as a PV kernel, and it's set to read your /boot/grub/grub.cfg. Your distribution scripts will keep that config up to date, so you would edit and update it in the normal way (update-grub). The difference will be that it is your own GRUB binary that parses your configuration.

Most users will have no reason to do this.

History

Are we sitting comfortably? Then I will begin…

Ancient history

Kernels and initrds had to live outside of the guest in dom0. Users couldn't update their own kernel but had to find some means to sync their kernel and initrd to some place in the dom0 (the privileged guest that controls everything else). People came up with various hacks to allow users to do that, the disadvantage being that quite a lot of that had to run in dom0.

Still rather a long time ago

Someone invented pygrub, a Python implementation of something that looked a bit like GNU GRUB 0.x. It was capable of looking inside an image file, block device or partitions on same, to try to find a file called menu.lst.

If it found that file it would try to parse it like GRUB would, display a menu emulating GRUB's menu, and pass the chosen kernel, initrd and boot arguments to Xen.

Advantages

  • Users could install their own kernel packages, since these would maintain a menu.lst file in their guest, and pygrub would then pick up the changes when they next booted.
  • Users could control their own kernel command line since this also was in the menu.lst file.

Disadvantages

  • It still all runs in dom0. pygrub is opening an actual filesystem that is supplied by the user, in the context of a user in dom0, and that is not a particularly safe thing to be doing.
  • It isn't actual GRUB, it's just trying to emulate it. That means it doesn't behave quite the same and doesn't support every single thing that is valid in a real GRUB menu.lst file.
  • As a further consequence of the above, pygrub is relying on userspace filesystem access libraries that aren't part of actual GRUB. This sometimes led to surprising discrepancies between what GRUB was expected to support and what pygrub would support, such as XFS filesystems, XZ-compressed initrds, etc.
  • GRUB 0.x is now referred to as GRUB-legacy. GRUB moved on to 1.x, it got a lot more complicated, and it now puts its configuration in /boot/grub/grub.cfg. Many Linux distributions gave up on GRUB-legacy and only support grub.cfg now, so users of such are left to maintain their own menu.lst file.
    On Debian and Ubuntu you can still install the package grub-legacy and it will maintain menu.lst, but it's getting increasingly persistent about wanting you to move to grub.cfg.

This is the boot method that BitFolk has almost always been using. Up until a couple of months ago there was still one customer who for various reasons had to have a kernel hard-coded in their dom0 config file, but now there are none. As of mid-February 2017 everyone was on pygrub.

Recent-ish history

Around 2010, in light of the above disadvantages, a member of the Xen project ported GRUB 0.x to boot as a paravirtual guest under Xen. That's the same sort of thing that your VPSes are. So you get a VM started, and the VM runs actual GRUB. Once GRUB knows what it wants to boot, it chainloads to that.

This was known as "pvgrub", an awful name really since it differs from pygrub in only the descender of one letter. Should have passed that one by the marketing department first.

Advantages

  • Runs only inside a VM.
  • Still allows user control.

Disadvantages

  • Part of the Xen code tree, not really packaged anywhere.
  • Still only GRUB-legacy support.

Due to it still only supporting GRUB-legacy, which wasn't any different from what pygrub supports, BitFolk did not pursue it.

We ran into a couple of sticky situations such as when XZ-compressed kernels became popular in Debian and pygrub didn't support that, but we worked around it.

You may recall, we modified pygrub to detect an XZ-compressed kernel and unpack it using the actual xz utility, until a newer version of pygrub could be installed which supported those kernels.

Just 3¼ years ago!

A GNU GRUB committer added PV-booting support to upstream GRUB 2.

What this meant is that using only the upstream GRUB 2 binaries we could generate a GRUB-as-kernel image that could be booted as a Xen PV virtual machine, which could then do everything that GRUB 2 can normally do, namely look inside your block devices for GRUB configs and boot your own kernels.

It took a couple of years for this to filter down to being supported in Debian stable and to iron out some bugs, so, fast forward to…

Now

As of mid-February 2017 everyone has been switched to GRUB2-as-PV-kernel booting.

Frequently Asked Questions

Are GPT partitions supported?

Probably. It has not been exhaustively tested.

In theory, BitFolk's GRUB is configured to search for the bootloader config files on every partition of every block device. In this case "partition" certainly means MSDOS ones, but GRUB can also see GPT and UFS partitions, so those should work as well.

GPT partitions are known to be bootable on Debian 8.x (jessie), so they probably also work on all later versions of Debian and Ubuntu. When creating the disklabel in the partitioning tool you will need to have changed the Debconf priority to "low" otherwise it will just assume that you want an MSDOS partition table and not offer you a choice.

Can I boot from a filesystem directly on a block device, i.e. no partition table?

Yes. BitFolk's GRUB searches for configuration files directly on all disk devices as well as on any partitions they may have.