[sheepdog-users] Stop cluster when missing node == n copies

Valerio Pachera sirio81 at gmail.com
Tue Sep 10 16:49:56 CEST 2013


Here's an example: 3 nodes, copies 2.
We know that in such a configuration only 1 node can fail at a time.
A second node can fail, but only after recovery from the first failure
has finished.

The question is: what happens when 2 nodes die at the same time?


dog node list
  Id   Host:Port         V-Nodes       Zone
   0   192.168.2.44:7000        214  738371776
   1   192.168.2.45:7000        51  755148992
   2   192.168.2.47:7000        119  788703424

dog node kill 2
dog node kill 1

dog cluster info
Cluster status: running, auto-recovery enabled

Cluster created at Tue Sep  3 10:29:42 2013

Epoch Time           Version
2013-09-10 16:04:38      5 [192.168.2.44:7000]
2013-09-10 16:04:37      4 [192.168.2.44:7000, 192.168.2.45:7000]
2013-09-04 15:30:03      3 [192.168.2.44:7000, 192.168.2.45:7000, 192.168.2.47:7000]
2013-09-04 13:40:13      2 [192.168.2.44:7000, 192.168.2.47:7000]
2013-09-03 10:29:38      1 [192.168.2.44:7000, 192.168.2.45:7000, 192.168.2.47:7000]

dog node recovery
Nodes In Recovery:
  Id   Host:Port         V-Nodes       Zone       Progress

dog node list
  Id   Host:Port         V-Nodes       Zone
   0   192.168.2.44:7000        128  738371776


I see there's no recovery running.
Does that mean the cluster is halted (as I would expect)?

If so, might it be a good idea to change the cluster status description,
since it still reports "running"?

The same question applies to a larger cluster, e.g. 9 nodes, copies 3,
and the loss of 3 nodes at the same time.
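
To make the arithmetic behind both scenarios concrete, here is a rough
Python sketch. It is not sheepdog code: the function names are made up
for illustration, and it assumes each object's copies sit on distinct
nodes chosen uniformly at random, which ignores the V-Nodes weighting
shown in "dog node list".

```python
from math import comb


def can_lose_objects(copies: int, failed: int) -> bool:
    """True if the nodes lost at once could hold every copy of some object."""
    return failed >= copies


def fraction_fully_lost(nodes: int, copies: int, failed: int) -> float:
    """Expected share of objects whose copies were all on the failed nodes.

    Assumption (not taken from sheepdog): replicas are placed on distinct
    nodes picked uniformly at random.
    """
    if not can_lose_objects(copies, failed):
        return 0.0
    # Replica sets that fit entirely inside the failed nodes,
    # divided by all possible replica sets.
    return comb(failed, copies) / comb(nodes, copies)


if __name__ == "__main__":
    # The two scenarios from this message, as (nodes, copies, failed).
    for nodes, copies, failed in [(3, 2, 2), (9, 3, 3)]:
        share = fraction_fully_lost(nodes, copies, failed)
        print(f"{nodes} nodes, copies {copies}, {failed} lost at once: "
              f"{share:.2%} of objects may have no surviving copy")
```

Under that assumption, as soon as the number of nodes lost at once
reaches the number of copies, some objects may have no surviving copy,
which is why I would expect the cluster to stop rather than keep running.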


