At the moment, I think that an IO error from a failed disk makes the corresponding sheep call leave_cluster() and drop into a gateway mode: it still forwards IO operations for the qemu processes attached to it, but no longer stores data, and presumably is no longer considered part of the cluster for the purposes of the consistent hash ring?

I wonder if it would make sense to be able to start a sheep daemon directly in this state, i.e. as a gateway daemon which doesn't have an associated store directory.

Nodes in sheepdog clusters will probably have multiple drives, and the natural thing to do with these is to run one sheep daemon per drive. (Running a single daemon on top of a RAID array is wasteful, as sheepdog does its own data replication.) For example, I've been testing on machines with 6*2TB drives each, running as sheep -p 700[0-5] -D against the individual filesystems.

If the first disk dies, the sheep on port 7000 leaves the cluster but continues forwarding for the local qemu processes. However, when I replace the disk, I can't kill that sheep and restart it on the new, clean filesystem, because all the VMs attached to it would lose their block storage. If I could start a pure gateway sheep instead, I could run that on port 7000 and use 700[1-6] for the data storage sheep, all of which would then be safe to kill and restart. Conversely, the gateway sheep has no associated storage, so it never needs to be restarted.

This would also enable non-storage nodes to run resilient qemu processes, each connecting to a local gateway sheep which forwards to the storage nodes in the ring. That is a (presumably easier, and mostly already working) alternative to implementing sheepdog failover support in qemu.

Does this make sense?

Cheers, Chris.
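P.S. For concreteness, here is a sketch of the per-node layout I have in mind. The "--gateway" flag and the /mnt/sheep[1-6] mount points are hypothetical -- no such flag exists today, that's the feature being proposed -- so the script just prints the command lines rather than launching anything:

```shell
# Hypothetical layout for a node with 6 data drives:
# one data sheep per drive on ports 7001-7006, each safe to kill
# and restart when its disk is replaced, plus one storeless gateway
# sheep on port 7000 for the local qemu processes to attach to.
# ("--gateway" is a proposed flag, not an existing sheep option;
# the mount points are illustrative.)

for i in 1 2 3 4 5 6; do
    echo "sheep -p 700$i -D /mnt/sheep$i"
done

echo "sheep --gateway -p 7000"
```

The point of the split is that the daemon qemu connects to (port 7000) is exactly the one with no on-disk state, so disk replacement never forces a restart of the process the VMs depend on.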