Suspend and restore

From BitFolk

What suspend and restore is and how it can be used with BitFolk VPSes.

Definition

Normally when a BitFolk host machine is subject to maintenance that requires a reboot, all VPSes on the machine will first be shut down. Once the maintenance is complete and the host is up again, all VPSes will be booted again.

It is possible instead to suspend a running VPS to permanent storage before the host is rebooted. When the host boots again, all suspended VPSes are restored from permanent storage before the remaining VPSes are booted.
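The start-up order described above can be pictured with a short Python sketch. It is only an illustration: the function name and the (name, was_suspended) tuples are made up for this example and are not part of any BitFolk tooling.

```python
def startup_order(vpses):
    """Return VPS names in the order they would come back after a host reboot.

    vpses is a list of (name, was_suspended) tuples. Both the tuple layout
    and this function are hypothetical, used only to show the ordering.
    """
    # Suspended VPSes are restored from storage first...
    restored = [name for name, was_suspended in vpses if was_suspended]
    # ...and only then are the remaining VPSes booted normally.
    booted = [name for name, was_suspended in vpses if not was_suspended]
    return restored + booted
```

For example, with VPSes "a" and "c" shut down and "b" and "d" suspended, the sketch brings "b" and "d" back before "a" and "c".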

This is very much like the hibernation feature which you may be familiar with from desktop Linux. When the VPS is restored, everything that was running before should be running again.

Advantages

As the VPS simply picks up from where it left off, this is faster and less disruptive than a full shutdown and boot. The state of the VPS should be the same as it was before, except that the clock will have stood still for the elapsed time and very likely all TCP connections will have been torn down.

Possible issues

Restore sometimes does not work correctly. The failure mode is quite deterministic: if restore works for a given VPS it should always work, but if it fails it will fail every time. At the moment it is difficult to predict whether it will work or not, so the only way to tell is to try it.

Sometimes the failure is in the kernel; in other cases it is an application which objects to time standing still for a long period.
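If you want to see how one of your own applications might notice the gap, here is a minimal Python sketch. It assumes that the wall clock gets stepped forward some time after restore (for example by NTP), which this page does not promise, and the function name and the five second tolerance are invented for the example.

```python
def unexplained_gap(previous_wall, current_wall, expected_interval, tolerance=5.0):
    """Return how many seconds passed beyond what was expected, or 0.0.

    previous_wall and current_wall are wall-clock timestamps in seconds
    (for example from time.time()) taken expected_interval seconds apart.
    The tolerance value is an arbitrary choice for this sketch.
    """
    gap = (current_wall - previous_wall) - expected_interval
    # Small differences are ordinary scheduling jitter, not a suspend.
    if abs(gap) <= tolerance:
        return 0.0
    return gap
```

A process that samples the clock every 10 seconds and sees an hour pass between two samples would get a gap of 3600 seconds, and could then reconnect or refresh its timers instead of treating everything as timed out.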

Therefore suspend/restore is not used by default. More than 20 of BitFolk's own infrastructure VMs do use it, however, so it succeeds more often than not.

Since this feature was first offered, around 25 BitFolk customer VPSes have been suspended and restored multiple times, and there has been a 100% success rate.

How to use

You can enable suspension in the runtime preferences section of the Panel.

As these are early days for the suspend feature, if you enable it then BitFolk will add a Nagios ping monitor to your VPS so that BitFolk can tell whether restore was successful the next time the host is rebooted. Since failures tend to be immediate, this will alert BitFolk to the problem and allow for a speedy manual reboot to minimise your downtime.

Limitations

The suspend preference currently only matters for occasions when BitFolk is performing scheduled maintenance on the host machine. Suspend will not take place if there is for some reason an unexpected reboot or crash of the host machine.
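Put another way, the preference only has an effect when the reboot is planned. The following Python sketch summarises that rule; the function name and the returned strings are hypothetical and chosen only for illustration.

```python
def action_before_host_reboot(suspend_enabled, scheduled_maintenance):
    """Return what happens to a VPS just before its host goes down.

    Both arguments are booleans. The return values "suspend", "shutdown"
    and "none" are labels made up for this sketch.
    """
    if not scheduled_maintenance:
        # Unexpected reboot or crash: there is no chance to suspend.
        return "none"
    # Scheduled maintenance: honour the suspend preference,
    # otherwise fall back to the normal shutdown.
    return "suspend" if suspend_enabled else "shutdown"
```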

You also won't be able to trigger a suspend/restore yourself, although it's unclear why you would want to do this.