[sheepdog-users] Cluster recovery after losing 2 devices

Andrew J. Hobbs ajhobbs at desu.edu
Tue Jun 17 15:21:42 CEST 2014


1)  Changing this behavior would mean that any disk failure triggers a cluster rebalance, which is not what you want as the cluster scales up.  I suppose that, in the event of an out-of-space error, a cluster reweight and rebalance would be preferable.  I'd almost prefer the client to fail with loud errors in the log and drop from the cluster if it encounters an out-of-space issue.  That still leaves you vulnerable to data loss in the event of a second failure during recovery.
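Roughly the kind of accounting a reweight would need (an illustrative Python sketch only, not sheepdog's placement code; the node names, weights and helper are made up):

# Illustrative only: sheepdog sizes a node's share of objects by its
# capacity, so a stale weight after a disk unplug keeps routing the old
# share of data to a node that just shrank.

def share_of_objects(weights, node):
    """Fraction of the cluster's data a node is expected to hold."""
    return weights[node] / sum(weights.values())

# Three nodes, each with two 1 TB disks (weights in TB).
weights = {"node0": 2.0, "node1": 2.0, "node2": 2.0}
print(share_of_objects(weights, "node1"))   # ~0.333

# node1 loses one disk but its weight is not recomputed: it is still asked
# to hold a third of the data on half the capacity, so it fills up first.

# Recomputing the weight shrinks its share, at the cost of a rebalance:
weights["node1"] = 1.0
print(share_of_objects(weights, "node1"))   # 0.2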

2) Actually, even with the default behavior being unsafe, you should have survived and remained up if you had been using -c 3, even in the event of two failures.  I've encountered that very state during a network partition.  But running -c 2, you were effectively running without any redundancy once the initial failure occurred.  When the unfortunate second failure hit, any blocks that had not yet finished recovering were lost.  Your problem was exacerbated by mismatched disk sizes on the hosts, resulting in a disk-full situation.
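The difference is easy to see with a toy placement simulation (purely illustrative, random placement rather than sheepdog's hashing): with two copies, any pair of failures that catches both replicas of an object before recovery finishes loses it outright; with three copies a third replica still survives.

import random

# Toy simulation: place each object's copies on distinct random nodes,
# then fail two nodes before recovery finishes and count objects that
# have no surviving copy.

def objects_lost(num_objects, copies, nodes, failed):
    lost = 0
    for _ in range(num_objects):
        placement = random.sample(nodes, copies)
        if all(node in failed for node in placement):
            lost += 1
    return lost

nodes = ["n0", "n1", "n2", "n3"]
random.seed(0)
print(objects_lost(10000, 2, nodes, {"n0", "n1"}))  # roughly 1/6 of objects gone
print(objects_lost(10000, 3, nodes, {"n0", "n1"}))  # 0: a third copy survives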

This is one of those corner cases that is probably best documented as not best practice (said as I set up a cluster using -c 2), with a note about the potential issue.  You had only two copies, drives mismatched enough that they could not hold the contents of that node normally, and on top of that a double failure.

That said, it may be preferable to have a check in place along the lines of: if internal recovery would exhaust free space on a node (going over 80% in use, maybe), fail the node rather than just detaching the volume from sheepdog md.
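A minimal sketch of that guard (the helper name and the 0.8 threshold are only illustrative, not an existing sheepdog interface):

USAGE_LIMIT = 0.80

def recovery_would_exhaust_space(used_bytes, capacity_bytes, incoming_bytes,
                                 limit=USAGE_LIMIT):
    """True if accepting incoming recovery data pushes usage over the limit."""
    return (used_bytes + incoming_bytes) / capacity_bytes > limit

# Example: a 2 TB node already 60% full asked to take 600 GB of recovery data.
TB = 1 << 40
GB = 1 << 30
if recovery_would_exhaust_space(1.2 * TB, 2 * TB, 600 * GB):
    print("fail the whole node instead of just detaching from sheepdog md")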


On 06/17/2014 03:55 AM, Valerio Pachera wrote:
1) When a disk gets unplugged, the weight of the node doesn't change.
That may lead to a disk-full condition on that node.
I don't know whether, in later sheepdog versions, the node gets disconnected from the cluster in such a case, or whether it is left in an unclear state (still in the cluster but unable to serve write requests).

2) The unlucky case of several devices breaking down in the same period on different hosts.
With redundancy -c 2, I may lose a single host or a single disk (on a single node).
With -c 2:2, I may lose 2 hosts, or 2 disks on 2 different hosts.
If I lose 3 hosts, the cluster halts itself, waiting for the missing nodes to come back up.
If 3 disks break down in the same time period (on different hosts), the cluster should also halt itself, or do something to keep the cluster consistent (waiting for a manual operation).
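(As a rough rule of thumb, assuming the x:y notation means x data strips plus y parity strips for erasure coding, which tolerates losing any y strips; the sketch below is illustrative, not sheepdog's actual accounting.)

def tolerable_failures(copies_spec):
    """Concurrent disk/host losses a redundancy setting should survive."""
    if ":" in copies_spec:                   # erasure coded, e.g. "2:2"
        data, parity = map(int, copies_spec.split(":"))
        return parity                        # any `parity` strips can be lost
    return int(copies_spec) - 1              # plain replication, e.g. "3"

for spec in ("2", "3", "2:2", "4:2"):
    print("-c %s: survives %d concurrent failure(s)" % (spec, tolerable_failures(spec)))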

Thank you.

Valerio.




