[sheepdog-users] Understanding number of copies and offline node behaviour

Liu Yuan namei.unix at gmail.com
Fri Sep 6 16:32:10 CEST 2013

On Fri, Sep 06, 2013 at 06:32:26AM +0200, Gerald Richter - ECOS wrote:
> Hi,
> I am trying to setup a new sheepdog cluster. I am using version 0.6.0.
> I have currently 2 host for the start, but the cluster should grow to 4 and more nodes.
> Since I cannot change the number of copies later on (as far as I can tell from studying posts from the list), I tried to format the cluster with 3 copies. This causes “IO has halted as there are too few living nodes”. So I reformat with copies = 2 and everything works fine, but if one node goes down I again get “IO has halted as there are too few living nodes”.

I don't think you need to reformat the cluster. 'collie cluster format -c number'
this number is the default copy number for a newly created vdi, but you can
actually set whatever value you want for each individual vdi.

E.g, you have a cluster formatting with '-c 2' with 4 nodes, then you can

$ collie vdi create new -c 3 # 3 copies for this new or
$ collie vdi create new -c 4 # 4 copies for this new.

> My expectation was that the number of copies specifies how many times the same block will be saved to a different node (or zone). So if I specify 3 copies, it will still work if maximum 2 nodes goes down in my cluster, because there is always a third node to pick up the data.
> I case of my two node test, I would have expected to continue to work, because if I have 2 copies and 2 nodes, there needs to be all data on both nodes, so if one of them fails the data is still on the other node.
> I would be very happy if somebody can enlighten me how the number of copies and the number of online nodes are related.

Let me start with a example, you have a 5 nodes cluster with 3 copies as default

If you have a vdi with 3 copies, it means sheepdog will store 3 copies
distributed on all the nodes because sheepdog stripes the vdi, so actually this
vdi's data is distributed on all the nodes.

If any node goes donw, sheepdog will automatically do the replica recovery. Now
you have 4 nodes, after recovery, you still 3 copies distributed evenly on these
4 nodes.

If you later add more nodes in, sheepdog will also try to rebalance the data
onto the newly added nodes automatically. Suppose at some point, you have 8
nodes. So you gain much more data reliability with the number increase. 2 nodes
go down for some reason at the same time, sheepdog will try to recover the lost
data on these 2 nodes from other alive nodes and after recovery,
you again have 3 copies.

For some reason you have 3 nodes go down at the same time, which means some data
will be lost and sheepdog can't recovery them from alive nodes. At this point, 
if you have one failed node go back, sheepdog will try to recover the lost data
from this node, and after recovery, you again will have 3 copies.

All the recovery and rebalance are done automatically without any manual

To put it simple, data in the sheepdog cluster are auto-healing. With more nodes
added in, you just aggregate more space and power linearly into the cluster.


More information about the sheepdog-users mailing list