[sheepdog-users] Cluster recover after loosing 2 devices

Valerio Pachera sirio81 at gmail.com
Tue Jun 17 18:07:17 CEST 2014

2014-06-17 16:23 GMT+02:00 Liu Yuan <namei.unix at gmail.com>:

> - if disks are unplugged by io error, we should reweight automatically.

Yes, this has to be done.

> - if disks are plugged/unplugged by users, we don't do auto-reweight.

I also have another idea about it, but I'll discuss it in a new thread.

> Try to pass '--strict' option to sheep daemon. It tells the cluster to stop
> the service if nodes number is less than required redundancy policy.

I may try it, but I don't think it's going to work because, as you say,
it applies
"...if nodes number is less...", not
"...if the number of lost devices is equal to or greater than the redundancy
policy".

Take my cluster with -c 2 as an example.
If 2 hosts went down at the same time, what would happen?
(The --strict option was not used.)
If 2 devices go down at the same time (as happened), does
sheepdog react in the same way?
In both cases we don't have enough objects.

Generalizing, using -c 3 does not solve the problem; it just makes it less
likely to happen.

Obviously, it doesn't matter if I lose 1, 2, 3 or all devices on a single
host at the same time.
It's very different if it happens on multiple hosts.
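
To make the difference concrete, here is a toy model in Python. It is only a
sketch and not sheepdog's real placement algorithm: I assume that every object
has its copies on distinct hosts and that, on each host, the copy sits on
exactly one disk. The names place() and lost_objects(), the host names and the
hash-based placement are all made up for the illustration.

```python
import hashlib


def _rank(key):
    # Deterministic pseudo-random number derived from a string key.
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)


def place(obj_id, hosts, copies):
    """Return the set of (host, disk) pairs holding a copy of obj_id.

    hosts maps a host name to its number of disks.
    ASSUMPTION: copies go to distinct hosts, one disk per host.
    """
    chosen = sorted(hosts, key=lambda h: _rank(f"{obj_id}:{h}"))[:copies]
    return {(h, _rank(f"{obj_id}:{h}:disk") % hosts[h]) for h in chosen}


def lost_objects(hosts, copies, failed_disks, n_objects=1000):
    """Return the ids of objects whose copies are all on failed disks."""
    failed = set(failed_disks)
    return [o for o in range(n_objects) if place(o, hosts, copies) <= failed]


if __name__ == "__main__":
    hosts = {"a": 2, "b": 2, "c": 2}  # 3 hypothetical hosts, 2 disks each
    # -c 2, two disks lost on the same host
    print(len(lost_objects(hosts, 2, [("a", 0), ("a", 1)])))
    # -c 2, two disks lost on different hosts
    print(len(lost_objects(hosts, 2, [("a", 0), ("b", 0)])))
    # -c 3, three disks lost on three different hosts
    print(len(lost_objects(hosts, 3, [("a", 0), ("b", 0), ("c", 0)])))
```

In this model, losing any number of disks on one host never loses an object,
because another copy always lives on a different host. Losing as many disks as
there are copies, spread over different hosts, loses every object that had all
its copies on exactly those disks. With -c 3 it takes three simultaneous
failures instead of two, which is why it is less likely but still possible.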

(Sorry if I repeat the same concept in different ways).
