[sheepdog] [PATCH v7 3/7] sheep: rejoin cluster after a zookeeper session timeout
Hitoshi Mitake
mitake.hitoshi at gmail.com
Wed Jun 26 03:28:43 CEST 2013
At Wed, 26 Jun 2013 09:12:06 +0800,
Kai Zhang wrote:
>
> On Jun 25, 2013, at 11:06 PM, Hitoshi Mitake <mitake.hitoshi at gmail.com> wrote:
>
> > As you say, rejoining would be the only way to handle a session timeout
> > correctly. But the current zookeeper driver causes serious problems
> > when network failures happen (e.g. inconsistent epochs).
> >
> > So I believe panic() or exit() would be better than doing
> > nothing. If sheeps using the zookeeper driver exit immediately in the
> > above case, we can restart them manually.
> > # I understand this solution goes against the policy of sheepdog... :(
> >
>
> I see. Do you mean a separate patch based on upstream, or one based on
> PATCH 1/7 and 2/7?
>
> Because these patches have been reviewed by Kazutaka and Yuan,
> I think they will be merged soon after some minor modifications.
I think patches 1-5 would make a good standalone patchset.
>
> Would you mind if we merge the whole series into the stable branch later?
>
Of course. Your zookeeper improvement is very important for the
safe operation of sheepdog.
> > And our internal team needs the solution by this Thursday (we have
> > a local change for this problem but it is a temporary and dirty
> > hack). If you can help us, I'd be very happy :)
>
> Our team has also been waiting for this patch for a long time :)
:)
Thanks,
Hitoshi