[Sheepdog] On gateway sheep and running a sheepdog cluster
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Thu Dec 15 10:22:06 CET 2011
At Tue, 13 Dec 2011 14:11:34 +0000,
Chris Webb wrote:
>
> At the moment, I think that an IO operation from a failed disk will make the
> corresponding sheep call leave_cluster(), dropping into a gateway mode where
> it forwards IO operations for the qemu processes attached to it, but doesn't
> store data any more, and presumably isn't considered part of the cluster for
> the purposes of the consistent hash ring?
Yes, that's right.
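As a rough illustration of what "not part of the consistent hash ring" means here, the following is a toy model in Python. It is not sheepdog's actual code: the Ring class, the hash function, the number of virtual nodes and the node names are all assumptions made for this sketch. It shows that when a sheep leaves the ring, objects are only placed on the remaining members, and that objects which were not stored on the departed sheep keep their placement.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Any stable hash works for this illustration; sheepdog's real hash differs.
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class Ring:
    """Toy consistent hash ring (hypothetical); only storage sheep are members."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes
        self._points = []  # sorted list of (hash, node) pairs

    def join(self, node: str) -> None:
        # Each node contributes several points so load spreads evenly.
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def leave(self, node: str) -> None:
        # A sheep that drops into gateway mode is simply removed from the ring.
        self._points = [p for p in self._points if p[1] != node]

    def members(self) -> set:
        return {node for _, node in self._points}

    def locate(self, obj_id: str, copies: int = 3) -> list:
        """Walk clockwise from the object's hash, collecting distinct nodes."""
        if not self._points:
            return []
        want = min(copies, len(self.members()))
        i = bisect.bisect(self._points, (_hash(obj_id), ""))
        found = []
        while len(found) < want:
            node = self._points[i % len(self._points)][1]
            if node not in found:
                found.append(node)
            i += 1
        return found


# Six sheep on one machine, one per drive (node names are made up).
ring = Ring()
for port in range(7000, 7006):
    ring.join(f"node1:{port}")

before = {f"obj{i}": ring.locate(f"obj{i}") for i in range(100)}
ring.leave("node1:7000")  # disk failure: this sheep becomes a gateway
after = {obj: ring.locate(obj) for obj in before}
```

In this model the sheep on port 7000 can still accept requests and call locate() to forward them, but it never appears in the result itself.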
>
> I wonder if it would make sense to be able to directly start a sheep daemon
> in this state, i.e. a gateway daemon which doesn't have an associated store
> directory.
>
> Nodes in sheepdog clusters will probably have multiple drives, and the
> natural thing to do with these is to run one sheep daemon per drive.
> (Running a single daemon on top of a RAID array is wasteful, as sheepdog
> does its own data replication.) For example, I've been testing on machines
> with 6*2TB drives each, running as sheep -p 700[0-5] -D against the
> filesystems.
>
> If the first disk dies, sheep -p 7000 leaves the cluster but continues
> forwarding for the local qemu processes. However, when I replace the disk, I
> can't kill and restart it on the new, clean filesystem because all the VMs
> will lose their block storage.
>
> However, if I could start a pure gateway sheep, I could run that on port
> 7000, and use 700[1-6] for data storage sheep, all of which are safe to kill
> and restart. Conversely, the gateway sheep doesn't have associated storage,
> so doesn't need to be restarted.
>
> This would also enable non-storage nodes to have resilient qemu processes
> running on them, connecting to a local gateway sheep which forwards to the
> storage nodes in the ring. This is a (presumably easier and mostly already
> working) alternative to implementing sheepdog failover support in qemu.
>
> Does this make sense?
It makes a lot of sense, and it would be a much better way to remove
the gateway SPOF than implementing connection failover in the qemu
block driver!
I don't think it would be difficult to add a gateway mode to the sheep
command line. I'll implement it after releasing 0.3.0. :)
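To make the proposed layout concrete, here is a minimal sketch in Python of a gateway on port 7000 forwarding to storage sheep on ports 7001-7006. Everything in it is hypothetical: the GatewaySheep and StorageSheep classes are invented for the illustration, placement uses a rendezvous-style hash as a compact stand-in for sheepdog's ring, and recovery of lost replicas is not modelled. It only shows that a storage sheep can be killed and restarted on a clean filesystem while clients attached to the gateway keep reading their data.

```python
import hashlib


def _score(node: str, obj_id: str) -> int:
    # Rendezvous-style placement; an assumption, not sheepdog's real algorithm.
    digest = hashlib.sha1(f"{node}/{obj_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


class StorageSheep:
    """Hypothetical storage daemon: owns a store, safe to kill and restart."""

    def __init__(self, port: int):
        self.port = port
        self.store = {}
        self.alive = True


class GatewaySheep:
    """Hypothetical gateway daemon: has no store, only forwards requests."""

    def __init__(self, port: int, storage: list, copies: int = 3):
        self.port = port
        self.storage = storage
        self.copies = copies

    def _targets(self, obj_id: str) -> list:
        # Pick the highest-scoring live storage sheep for this object.
        live = [s for s in self.storage if s.alive]
        live.sort(key=lambda s: _score(str(s.port), obj_id), reverse=True)
        return live[: self.copies]

    def write(self, obj_id: str, data: bytes) -> int:
        targets = self._targets(obj_id)
        for sheep in targets:
            sheep.store[obj_id] = data
        return len(targets)

    def read(self, obj_id: str) -> bytes:
        # Try each replica in turn; a freshly restarted sheep may be empty.
        for sheep in self._targets(obj_id):
            if obj_id in sheep.store:
                return sheep.store[obj_id]
        raise KeyError(obj_id)


storage = [StorageSheep(port) for port in range(7001, 7007)]
gateway = GatewaySheep(7000, storage)
for i in range(10):
    gateway.write(f"obj{i}", f"data{i}".encode())

storage[0].alive = False  # kill the sheep on port 7001 to swap its disk
during = [gateway.read(f"obj{i}") for i in range(10)]

storage[0].store = {}     # restart it on the new, clean filesystem
storage[0].alive = True
afterwards = [gateway.read(f"obj{i}") for i in range(10)]
```

The qemu processes only ever talk to the gateway object, so nothing they are connected to has to be restarted when a disk is replaced.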
Thanks,
Kazutaka