[Sheepdog] [PATCH V2 2/2] sheep: teach sheepdog to better recovery the cluster
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Thu Sep 22 08:01:47 CEST 2011
At Wed, 21 Sep 2011 14:59:26 +0800,
Liu Yuan wrote:
>
> Kazutaka,
> I guess this patch addresses inconsistency problem you mentioned.
> other comments are addressed too.
Thanks, this solves the inconsistency problem in a nice way! I've
applied 3 patches in the v3 patchset.
There is still a problem we need to solve. For example:
$ for i in 0 1 2; do sheep /store/$i -z $i -p 700$i; sleep 1; done
$ collie cluster format
$ for i in 0 1 2; do pkill -f "sheep /store/$i"; sleep 1; done
$ for i in 0 1 2; do ./sheep/sheep /store/$i -z $i -p 700$i; sleep 1; done
$ for i in 1 2; do ./sheep/sheep /store/$i -z $i -p 700$i; sleep 1; done
After that, we get a consistent epoch log like the following:
Creation time        Epoch  Nodes
2011-09-22 14:18:33      6  [10.68.14.1:7000, 10.68.14.1:7001, 10.68.14.1:7002]
2011-09-22 14:18:33      5  [10.68.14.1:7000, 10.68.14.1:7001]
2011-09-22 14:18:33      4  [10.68.14.1:7000]
2011-09-22 14:18:33      3  [10.68.14.1:7002]
2011-09-22 14:18:33      2  [10.68.14.1:7001, 10.68.14.1:7002]
2011-09-22 14:18:33      1  [10.68.14.1:7000, 10.68.14.1:7001, 10.68.14.1:7002]
In this case, Sheepdog discards all the objects that were stored
before epoch 4. This is because the node lists of epochs 3 and 4 do
not overlap, and Sheepdog cannot currently handle this situation.
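The missing overlap can be checked mechanically. The following is a
minimal Python sketch, not Sheepdog code: the `epochs` dictionary is
copied by hand from the epoch log above, and `disjoint_transitions` is
a hypothetical helper name used only for illustration.

```python
# Illustrative sketch only; this is not part of Sheepdog.
# Node lists per epoch, copied from the epoch log shown above.
epochs = {
    1: {"10.68.14.1:7000", "10.68.14.1:7001", "10.68.14.1:7002"},
    2: {"10.68.14.1:7001", "10.68.14.1:7002"},
    3: {"10.68.14.1:7002"},
    4: {"10.68.14.1:7000"},
    5: {"10.68.14.1:7000", "10.68.14.1:7001"},
    6: {"10.68.14.1:7000", "10.68.14.1:7001", "10.68.14.1:7002"},
}


def disjoint_transitions(epochs):
    """Return (prev, next) epoch pairs whose node lists share no node."""
    numbers = sorted(epochs)
    return [
        (prev, nxt)
        for prev, nxt in zip(numbers, numbers[1:])
        if not (epochs[prev] & epochs[nxt])  # empty intersection
    ]


print(disjoint_transitions(epochs))  # [(3, 4)]
```

Only the transition from epoch 3 to epoch 4 has an empty intersection,
which matches the point at which the older objects are lost.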
I think this can be fixed with a small change. I'll dig into this
issue.
Thanks,
Kazutaka