                   _      _            _
         _ __ ___ (_) ___| | __  _ __ | |
        | '_ ` _ \| |/ _ \ |/ / | '_ \| |
        | | | | | | |  __/   < _| | | | |
        |_| |_| |_| |\___|_|\_(_)_| |_|_|
                  | |
                  |_|       Thoughts on (technical) stuff...

Currently we are building a fairly rock-solid high-availability cluster for a client. This has the "usual" ingredients: two locations, two NetApps, two clusters of three VMware ESX servers and a bunch of virtual machines running on top of the ESX servers. Also included in the mix is a VDI (now called View) virtual desktop infrastructure for running virtual Windows XP clients.

This is all managed by SRM (Site Recovery Manager) and it is almost working. But that is another story.

What got me thinking is the following.

Last week I did a consultancy job where they had built a failover cluster using DRBD. With DRBD you have a disk device /dev/drbd/0 which is transparently replicated. The device file can be used like any other: fdisk, mkfs and mount all work as expected.

Now throw KVM into the mix...

The virtual machine images must be stored on the DRBD device. Suppose we have two servers called master and slave. On master the kvm processes run. In a failover situation the following needs to happen:

  • If master is still available, kill all kvm processes;
  • If master is still available, set the DRBD device in secondary mode or disable it altogether;
  • On slave, make the DRBD device primary so that it becomes available in rw mode (if you don't do this you get "Wrong medium type" errors);
  • On slave, start the kvm processes again.
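A small guard between the last two steps might help here: only start kvm once the local side really is primary. This is a minimal sketch, assuming a DRBD version whose `drbdadm role` prints the roles as `Local/Peer` (for example `Primary/Secondary`). The function names and the resource name `r0` are made up for illustration.

```shell
# Extract the local role from a "Local/Peer" string,
# e.g. "Primary/Secondary" -> "Primary".
local_role() {
    echo "${1%%/*}"
}

# Succeed only if the local role is Primary.
is_primary() {
    [ "$(local_role "$1")" = "Primary" ]
}

# Possible use on the slave, after promoting (r0 is a hypothetical
# resource name, take the real one from your drbd.conf):
#   roles=$(drbdadm role r0)
#   is_primary "$roles" && /etc/init.d/kvm start
```

The parsing is kept separate from the `drbdadm` call so it can be tried out with plain strings on a machine without DRBD.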

It would be even cooler if the virtual machines could actually be copied over while they are still running, but I don't know if that would be possible.

Shared storage would be possible by letting one virtual machine export another DRBD device (via NFS, Samba or iSCSI).

So my site recovery manager script (SRM script) will be something along the lines of this:

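#!/bin/bash

# When doing a failover, call it on the old site
# (if still available): srm stop
# On the other side call it like: srm start

# drbdadm takes a command followed by a resource name, not a device
# path. "r0" is an assumption; use the name from your drbd.conf.
RESOURCE=r0

case "$1" in
stop)
    /etc/init.d/kvm stop
    drbdadm secondary "$RESOURCE"
    ;;
start)
    # Plain promote. Forcing it (the -o / --overwrite-data-of-peer
    # option) can discard the peer's data, so it is left out here.
    drbdadm primary "$RESOURCE"
    /etc/init.d/kvm start
    ;;
*)
    echo "Usage: $0 {start|stop}" >&2
    exit 1
    ;;
esac
exit 0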

Is it really that simple?

Posted in: linux

5 comments

It seems logical to assume you will be deploying more than just one server group on master and slave. Then you have to rethink your setup. I would think in terms of cluster1 and cluster2 rather than master and slave. On top of cluster1 and cluster2 I would define some resource servers.

This setup will be more complex, but it will also give you a lot of flexibility.
More than one KVM server does not really change this setup. All KVM wants is a file, so you only have to manage multiple files on the drbd0 disk.

Fancy stuff, like restarting the servers in a particular order, can all be handled in the kvm start/stop script.
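To make the ordering idea a bit more concrete, here is a rough sketch of what such a start/stop script could do: start the machines in a fixed order and stop them in reverse. The VM names and the `start_vm`/`stop_vm` hooks are placeholders invented for this example; the hooks only echo, so the real kvm invocation and shutdown mechanism still have to be filled in.

```shell
# Start order, space separated; stop order is the reverse.
# The names are made up for this example.
VMS="database appserver webserver"

# Placeholder hooks: replace the echo with whatever starts or
# shuts down one virtual machine in your setup.
start_vm() { echo "start $1"; }
stop_vm()  { echo "stop $1"; }

# Start every VM in the listed order, aborting on the first failure.
start_all() {
    for vm in $VMS; do
        start_vm "$vm" || return 1
    done
}

# Stop every VM in reverse order.
stop_all() {
    reversed=""
    for vm in $VMS; do
        reversed="$vm $reversed"   # prepend to reverse the list
    done
    for vm in $reversed; do
        stop_vm "$vm"
    done
}
```

Calling `start_all` from the `start)` branch and `stop_all` from the `stop)` branch would keep the ordering logic in one place.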
You ask 'is it that simple?'... Maybe there are situations where problems arise. Can the two servers end up in a state in which they produce corruption, for instance when there are two masters because the servers can't communicate correctly and each one promotes itself to master? Maybe you need a check file on drbd0 for that situation. Novell talks about split brain in their cluster software.

It's just a thought...
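The two-masters worry can at least be partly checked before promoting. Below is a minimal sketch, assuming a DRBD version where `drbdadm cstate` prints a connection state such as `Connected` and `drbdadm role` prints `Local/Peer`. The function name, the return codes and the resource name `r0` are invented for this example. It cannot prevent split brain when the peer is simply unreachable; it only refuses the clearly wrong case and flags the uncertain one.

```shell
# Decide whether promoting the local node looks safe.
#   $1: connection state, e.g. "Connected" or "WFConnection"
#   $2: roles as "Local/Peer", e.g. "Secondary/Primary"
# Returns 0 = peer is connected and secondary, safe to promote
#         1 = peer is primary, refuse
#         2 = peer state unknown, a human (or fencing) must decide
safe_to_promote() {
    cstate=$1
    case $2 in
        */*) peer_role=${2#*/} ;;   # part after the slash
        *)   peer_role=Unknown ;;   # unexpected format
    esac

    if [ "$peer_role" = "Primary" ]; then
        echo "peer is Primary, refusing to promote" >&2
        return 1
    fi
    if [ "$cstate" = "Connected" ] && [ "$peer_role" = "Secondary" ]; then
        return 0
    fi
    echo "peer state unknown (cstate=$cstate), cannot rule out split brain" >&2
    return 2
}

# Possible use (r0 is a hypothetical resource name):
#   safe_to_promote "$(drbdadm cstate r0)" "$(drbdadm role r0)" \
#       && drbdadm primary r0
```

In a real failover the old master is often dead, which lands in the "unknown" case, so the decision to promote anyway stays with the operator or a fencing mechanism.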
You should use DRBD in master-slave mode (or whatever it is called). In that setup the slave cannot write to the DRBD device; KVM needs a writable file just to start up, so on the slave KVM cannot even run.

Of course, having the two sides both in slave mode is still a problem. A split slave situation? ;-)
This severely limits your options... DRBD can also be used in master-master mode. That way you use your hardware more efficiently. But I can imagine that the master-slave situation will suffice for your needs.

Comments are closed

If you really, really want to comment, please mail miek@miek.nl.