[Sheepdog] [PATCH V2 2/2] sheep: teach sheepdog to better recovery the cluster

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Fri Sep 23 14:54:29 CEST 2011


At Thu, 22 Sep 2011 14:34:17 +0800,
Liu Yuan wrote:
> 
> On 09/22/2011 02:01 PM, MORITA Kazutaka wrote:
> > At Wed, 21 Sep 2011 14:59:26 +0800,
> >
> > After that, we get a consistent epoch log like the following.
> >
> >    Creation time        Epoch Nodes
> >    2011-09-22 14:18:33      6 [10.68.14.1:7000, 10.68.14.1:7001, 10.68.14.1:7002]
> >    2011-09-22 14:18:33      5 [10.68.14.1:7000, 10.68.14.1:7001]
> >    2011-09-22 14:18:33      4 [10.68.14.1:7000]
> >    2011-09-22 14:18:33      3 [10.68.14.1:7002]
> >    2011-09-22 14:18:33      2 [10.68.14.1:7001, 10.68.14.1:7002]
> >    2011-09-22 14:18:33      1 [10.68.14.1:7000, 10.68.14.1:7001, 10.68.14.1:7002]
> >
> > In this case, Sheepdog discards all the objects which were stored
> > before epoch 4.  This is because there is no overlap between epochs 3
> > and 4, and Sheepdog cannot handle this situation now.
> >
> > I think this can be fixed with a small change.  I'll dig into this
> > issue.
> >
> >
> > Thanks,
> >
> > Kazutaka
> Hi Kazutaka,
>      I also noticed objects being discarded by sheepdog in a similar
> situation, but I have no idea why for now. Would you please explain in
> a bit more detail the reason for this specific situation?

In the recovery phase, Sheepdog recovers objects from the previous
epoch into the current epoch.  If the target object is not found in the
previous epoch, Sheepdog searches for it in the epoch before that, and
keeps going back to older epochs until it finds the object.
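
A minimal sketch of that backward walk, written in Python only for
illustration.  This is not the actual sheep code: the function name
find_object_epoch, the node names and the dict layout are all
hypothetical.

```python
# Hypothetical model (not Sheepdog's real data structures):
#   epoch_log maps an epoch number to the list of member nodes,
#   stored maps a node to the set of (epoch, object id) pairs it holds.
def find_object_epoch(oid, current_epoch, epoch_log, stored):
    """Walk back from the epoch before current_epoch until oid is found.

    Returns (epoch, node) for the first hit, or None if no epoch has it.
    """
    for epoch in range(current_epoch - 1, 0, -1):
        for node in epoch_log[epoch]:
            if (epoch, oid) in stored.get(node, set()):
                return epoch, node
    return None


# Made-up example: the object was last written in epoch 1 on node "B".
epoch_log = {1: ["A", "B"], 2: ["A"], 3: ["A", "C"]}
stored = {"B": {(1, "obj1")}}
print(find_object_epoch("obj1", 3, epoch_log, stored))  # (1, 'B')
```

Note that the walk needs the member list of every epoch it steps
through; it cannot continue past an epoch whose member list is unknown.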

In the above situation, when the target objects are not found in
epochs 6, 5, and 4, Sheepdog searches for them in epoch 3.  However,
the log for epoch 3 is stored only on [10.68.14.1:7002], so
[10.68.14.1:7000] and [10.68.14.1:7001] don't know which nodes were
members of epoch 3.  Similarly, [10.68.14.1:7002] doesn't have epoch 4,
so none of the sheep daemons can go back from epoch 4 to epoch 3.

I think the solution would be simple; we only have to support getting
epoch information from remote nodes.
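
To make the idea concrete, here is a rough Python model of the
situation in the epoch table quoted above.  It is only a sketch under
two assumptions: a node keeps the log of an epoch only if it was a
member of that epoch, and the remote lookup can simply ask the other
current members.  The names get_epoch_local, get_epoch_remote and
reachable_epochs are hypothetical, and the "remote" call is an
in-memory lookup rather than a real network request.

```python
# Membership per epoch, copied from the epoch table quoted above.
MEMBERS = {
    1: ["10.68.14.1:7000", "10.68.14.1:7001", "10.68.14.1:7002"],
    2: ["10.68.14.1:7001", "10.68.14.1:7002"],
    3: ["10.68.14.1:7002"],
    4: ["10.68.14.1:7000"],
    5: ["10.68.14.1:7000", "10.68.14.1:7001"],
    6: ["10.68.14.1:7000", "10.68.14.1:7001", "10.68.14.1:7002"],
}

# Assumption: a node keeps the log of an epoch only if it was a member.
LOCAL_LOGS = {
    node: {e: nodes for e, nodes in MEMBERS.items() if node in nodes}
    for node in MEMBERS[6]
}


def get_epoch_local(node, epoch):
    """Return the member list of `epoch` from node's own log, or None."""
    return LOCAL_LOGS[node].get(epoch)


def get_epoch_remote(node, epoch):
    """Hypothetical fallback: ask the other current members for the log."""
    local = get_epoch_local(node, epoch)
    if local is not None:
        return local
    for peer in MEMBERS[6]:
        if peer != node:
            found = get_epoch_local(peer, epoch)
            if found is not None:
                return found
    return None


def reachable_epochs(node, lookup, current=6):
    """Epochs the node can step back through before the chain breaks."""
    seen = []
    for epoch in range(current, 0, -1):
        if lookup(node, epoch) is None:
            break
        seen.append(epoch)
    return seen


print(reachable_epochs("10.68.14.1:7000", get_epoch_local))   # [6, 5, 4]
print(reachable_epochs("10.68.14.1:7000", get_epoch_remote))  # [6, 5, 4, 3, 2, 1]
```

With local logs only, 7000 stops at epoch 4, which matches the
discarded objects described above; with the remote fallback it can
walk all the way back to epoch 1.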


Thanks,

Kazutaka


