[sheepdog-users] Stop cluster when missing node == n copies
Liu Yuan
namei.unix at gmail.com
Tue Sep 10 17:07:01 CEST 2013
On Tue, Sep 10, 2013 at 04:49:56PM +0200, Valerio Pachera wrote:
> Here's an example: 3 node, copies 2.
> We know that, in such a configuration, only 1 node can die at a time.
> A second node could die, but not before the end of the recovery.
>
> The question is, what happens when 2 nodes die at the same time?
>
>
> dog node list
> Id Host:Port V-Nodes Zone
> 0 192.168.2.44:7000 214 738371776
> 1 192.168.2.45:7000 51 755148992
> 2 192.168.2.47:7000 119 788703424
>
> dog node kill 2
> dog node kill 1
>
> dog cluster info
> Cluster status: running, auto-recovery enabled
>
> Cluster created at Tue Sep 3 10:29:42 2013
>
> Epoch Time Version
> 2013-09-10 16:04:38 5 [192.168.2.44:7000]
> 2013-09-10 16:04:37 4 [192.168.2.44:7000, 192.168.2.45:7000]
> 2013-09-04 15:30:03 3 [192.168.2.44:7000, 192.168.2.45:7000,
> 192.168.2.47:7000]
> 2013-09-04 13:40:13 2 [192.168.2.44:7000, 192.168.2.47:7000]
> 2013-09-03 10:29:38 1 [192.168.2.44:7000, 192.168.2.45:7000,
> 192.168.2.47:7000]
>
> dog node recovery
> Nodes In Recovery:
> Id Host:Port V-Nodes Zone Progress
>
> dog node list
> Id Host:Port V-Nodes Zone
> 0 192.168.2.44:7000 128 738371776
>
>
> I see there's no recovery running.
From whom could this node recover objects? There is only one node left. Please
try this instead: 4 nodes with 2 copies, and kill 2 nodes at the same time.
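
To make the point concrete, here is a rough Python sketch of which objects still have a copy to recover from after some nodes die at once. It is only an illustration, not Sheepdog code: the function name, the object names and the explicit placement table are all made up, and real placement is decided by sheep itself.

```python
def surviving_sources(placement, dead):
    """For each object, list the nodes that still hold a copy once `dead` nodes are gone."""
    dead = set(dead)
    return {obj: [n for n in nodes if n not in dead]
            for obj, nodes in placement.items()}

# Hypothetical placement for 3 nodes with 2 copies per object.
placement = {
    "obj_a": ["node0", "node1"],
    "obj_b": ["node1", "node2"],
    "obj_c": ["node0", "node2"],
}

# Kill node1 and node2 at the same time, as in the example above.
after = surviving_sources(placement, dead=["node1", "node2"])

# Objects with no surviving copy cannot be recovered from anywhere.
no_source = [obj for obj, sources in after.items() if not sources]

print(after)
print(no_source)
```

In this toy table node0 is the only node left, so there is no peer to copy anything from or to, and the object that lived only on the two killed nodes has no source at all.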
> Does that mean the cluster is halted? (As I would expect to)
The cluster will never be halted if the number of storage nodes >= 1.
>
> In such a case, might it be a good idea to change the cluster status description?
>
> The above example applies also with a larger cluster, e.g. 9 nodes,
> copies 3, and the loss of 3 nodes at the same time.
If 3 nodes fail at the same time and one of them can come back after the
failure, then you don't lose any data.
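
A small sketch of the counting behind this, under simplifying assumptions that are mine and not Sheepdog's: every object keeps its 3 copies on 3 distinct nodes, any group of 3 nodes is a possible replica set, and a node that comes back still has its data on disk. The helper name is hypothetical.

```python
from itertools import combinations

def fully_lost_sets(nodes, copies, failed):
    """Replica sets (groups of `copies` nodes) whose members have all failed."""
    failed = set(failed)
    return [s for s in combinations(nodes, copies) if set(s) <= failed]

nodes = list(range(9))                      # 9 nodes, numbered 0..8
all_sets = list(combinations(nodes, 3))     # every possible 3-copy replica set

# Nodes 0, 1 and 2 fail at the same time.
lost_now = fully_lost_sets(nodes, 3, failed=[0, 1, 2])

# Node 0 comes back, so only nodes 1 and 2 are still missing.
lost_after_return = fully_lost_sets(nodes, 3, failed=[1, 2])

print(len(all_sets), lost_now, lost_after_return)
```

Only objects whose copies sat on exactly the 3 failed nodes have no reachable copy, and once any one of those nodes returns, every replica set has at least one member alive again.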
Thanks
Yuan