[Sheepdog] Sheepdog reliability

Dennis Jacobfeuerborn dennisml at conversis.de
Thu Nov 18 15:48:01 CET 2010


On 11/18/2010 09:45 AM, MORITA Kazutaka wrote:
> Hi,
>
> At Wed, 17 Nov 2010 14:44:34 +0100,
> Dennis Jacobfeuerborn wrote:
>>
>> Hi,
>> I've been following Sheepdog for a while and now that patches are being
>> sent to include it in libvirt I want to start testing it. One question I
>> have is how I can ensure the reliability of the Sheepdog cluster as a
>> whole. Specifically I'm looking at two cases:
>>
>> Lets assume a setup with 4 nodes and a redundancy of 3.
>>
>> If one node fails what are the effects both for the cluster and the clients
>> (e.g. potential i/o delays, messages, etc.)
>
> Until Sheepdog starts a new round of membership, the cluster suspends
> any requests to data objects and client I/O is blocked.  How long the
> clients wait is determined by the value of totem/consensus in
> corosync.conf; the default value is 1200 ms.  If you want to run
> Sheepdog with a large number of nodes, the value needs to be larger,
> and the delay becomes correspondingly longer.
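
(For reference, the knob in question lives in the totem section of
corosync.conf.  A minimal sketch: consensus matches the default quoted
above, and token: 1000 is assumed to be the stock corosync default:)

totem {
        version: 2

        # How long (ms) to wait for consensus before starting a new
        # round of membership configuration.  Sheepdog I/O stays
        # suspended until the new membership has formed.
        consensus: 1200

        # Token loss detection timeout; corosync requires consensus
        # to be at least 1.2 * token.
        token: 1000
}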

Wouldn't it be better to decouple the client requests from these cluster 
timings? This looks like an unnecessary bottleneck that gets worse as the 
cluster grows. Why not give the client request its own timeout of, say, 
1 second, and if no response arrives retry the request against one of the 
nodes that carries a redundant copy of the blocks?
That way a node failure would have less of an impact on the applications, 
and the delays seen by application requests would become independent of 
the cluster size.
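
Something along these lines; a rough sketch only, and every name in it
is made up for illustration, not an actual sheep/qemu interface:

#include <stdbool.h>
#include <stdio.h>

#define NR_COPIES 3
#define CLIENT_TIMEOUT_MS 1000  /* per-request timeout, e.g. 1 second */

struct request { long oid; };

/* Stand-in for the real transport; here we pretend node 0 is dead
 * (times out) and every other node answers. */
static bool send_and_wait(int node, struct request *req, int timeout_ms)
{
        (void)req; (void)timeout_ms;
        return node != 0;
}

/* Try each node that holds a copy of the object in turn: a timeout
 * on one node means moving on to the next replica instead of
 * blocking until corosync has formed a new membership. */
static bool read_object(struct request *req, const int replicas[NR_COPIES])
{
        for (int i = 0; i < NR_COPIES; i++)
                if (send_and_wait(replicas[i], req, CLIENT_TIMEOUT_MS))
                        return true;
        return false;   /* all copies unreachable */
}

int main(void)
{
        struct request req = { .oid = 42 };
        int replicas[NR_COPIES] = { 0, 1, 2 };  /* node 0 just failed */

        printf("read %s\n", read_object(&req, replicas) ? "ok" : "failed");
        return 0;
}

Reads could fail over like this transparently; writes would of course
need more care to keep the copies consistent, but the principle is the
same.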

>> and what needs to be done once
>> the node is replaced to get the cluster back into a healthy state?
>
> All you need to do is start the sheep daemon again.  If that
> doesn't work, please let me know.
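
(For reference: the sheep daemon takes the path to its object store as
its argument, so restarting it should look roughly like the following,
assuming the store lives under /var/lib/sheepdog:)

    $ sheep /var/lib/sheepdog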

So when a node goes down, will the cluster automatically copy the lost 
blocks to another node to re-establish the redundancy requirement of 3 
copies?

When a new node is added to the cluster, will it stay empty, or will the 
cluster rebalance the blocks according to some load criterion?

>>
>> What happens if *all* nodes fail due to e.g. a power outage? What needs to
>> be done to bring the cluster back up again?
>
> If no VM is running when all nodes fail, all you need to do is
> start all the sheep daemons again.  However, if I/O requests are in
> flight when all nodes fail, Sheepdog needs to recover the objects
> whose replicas are in inconsistent states (and this is not
> implemented yet).
>

What is the timeframe for this implementation? After all, this has to be 
implemented before Sheepdog can go into production use.

Regards,
   Dennis


