[Sheepdog] [PATCH v2] sheep: fix a network partition issue

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Mon Oct 31 15:11:26 CET 2011


At Mon, 31 Oct 2011 21:44:34 +0800,
Chaos Eternal wrote:
> 
> IMHO,
> 
> If cluster partitioning happens, we should introduce some
> STONITH techniques to avoid data corruption.
> Generally, the steps are as follows:
> 1. partitioning detected
> 2. wait some interval to confirm the partition
> 3. vote to STONITH

I'm not familiar with STONITH.  What exactly is done in the "vote"
phase?

Thanks,

Kazutaka

> 
> STONITH can either truly shut the node down or just stop those nodes
> from running. We can discuss further.
> 
> 
> 
> On Mon, Oct 31, 2011 at 8:36 PM, Liu Yuan <namei.unix at gmail.com> wrote:
> > On 10/31/2011 07:49 PM, MORITA Kazutaka wrote:
> >
> >> At Mon, 31 Oct 2011 19:37:51 +0800,
> >> Liu Yuan wrote:
> >>>
> >>> On 10/31/2011 07:10 PM, MORITA Kazutaka wrote:
> >>>
> >>>> At Mon, 31 Oct 2011 18:15:06 +0800,
> >>>> Liu Yuan wrote:
> >>>>>
> >>>>> On 10/31/2011 06:00 PM, MORITA Kazutaka wrote:
> >>>>>
> >>>>>> At Tue, 25 Oct 2011 15:06:05 +0800,
> >>>>>> Liu Yuan wrote:
> >>>>>>>
> >>>>>>> On 10/25/2011 02:55 PM, zituan at taobao.com wrote:
> >>>>>>>
> >>>>>>>> From: Yibin Shen <zituan at taobao.com>
> >>>>>>>>
> >>>>>>>> In some situations, sheep may be disconnected from corosync momentarily
> >>>>>>>> while both sheep and corosync keep running and neither of them
> >>>>>>>> exits. The disconnected sheep may then receive a confchg message
> >>>>>>>> from corosync notifying it that this sheep itself has left.
> >>>>>>>> That leads to a network partition; this patch fixes it.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Yibin Shen <zituan at taobao.com>
> >>>>>>>> ---
> >>>>>>>>  sheep/group.c |    3 +++
> >>>>>>>>  1 files changed, 3 insertions(+), 0 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/sheep/group.c b/sheep/group.c
> >>>>>>>> index e22dabc..ab5a9f0 100644
> >>>>>>>> --- a/sheep/group.c
> >>>>>>>> +++ b/sheep/group.c
> >>>>>>>> @@ -1467,6 +1467,9 @@ static void sd_leave_handler(struct sheepdog_node_list_entry *left,
> >>>>>>>>         struct work_leave *w = NULL;
> >>>>>>>>         int i, size;
> >>>>>>>>
> >>>>>>>> +       if (node_cmp(left, &sys->this_node) == 0)
> >>>>>>>> +               panic("BUG: this node can't be on the left list\n");
> >>>>>>>> +
> >>>>>>>
> >>>>>>>
> >>>>>>> Hmm, the panic output looks confusing. How about "Network Partition Bug:
> >>>>>>> I should have exited.\n"? The output will be seen by
> >>>>>>> administrators, not only programmers.
> >>>>>>
> >>>>>> Applied after modifying output text, thanks!
> >>>>>>
> >>>>>> Kazutaka
> >>>>>
> >>>>>
> >>>>> Kazutaka,
> >>>>>    Maybe we should not panic when it becomes a single-node cluster.
> >>>>> The node will change into HALT state, which doesn't do any harm to its data.
> >>>>
> >>>> It is much better.  Currently, Sheepdog kills a minority cluster in
> >>>> __sd_leave() when a network partition occurs because it is the simplest
> >>>> way to keep data consistent.
> >>>>
> >>>> But this looks like a different issue from this patch.  Does corosync
> >>>> include the local node in the left list when a network partition occurs?
> >>>> If so, we should handle it in the corosync cluster driver because it
> >>>> looks like a corosync-specific issue to me.
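
For illustration only, the "kill the minority side" rule mentioned above
can be sketched as below.  This is not the actual __sd_leave() code:
has_majority(), on_leave() and the enum are hypothetical names, and the
strict-majority threshold is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not Sheepdog's real code): true when the nodes
 * still reachable form a strict majority of the membership that existed
 * before the partition.  Assumption: an even split keeps neither side.
 */
static bool has_majority(size_t nr_remaining, size_t nr_before)
{
	return nr_remaining * 2 > nr_before;
}

enum partition_action { KEEP_RUNNING, EXIT_MINORITY };

/* Decide what the local node does after a leave event shrinks its view. */
static enum partition_action on_leave(size_t nr_remaining, size_t nr_before)
{
	return has_majority(nr_remaining, nr_before) ? KEEP_RUNNING
						      : EXIT_MINORITY;
}
```

With three nodes, the side that still sees two members keeps running and
the isolated node exits; under this assumed rule a 2/2 split of four
nodes stops both sides.
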
> >>>>
> >>>
> >>> I am not sure, but if corosync includes the local node in the left list, it
> >>> should be a bug in corosync.
> >>>
> >>> Let's assume three nodes (a,b,c).
> >>> I suspect that the leave message is for n(b,c), but after n(a)
> >>> rejoins, for whatever reason, the message is still being broadcast, and
> >>> n(a) just gets it wrongly.
> >>
> >> IIUC, n(a) should receive the leave message of n(b,c).
> >
> > Yes, but n(a) should not receive the leave_message(a) which is intended
> > for n(b,c).
> >
> > So the correct sequence should be:
> >
> > network partition happens
> > lm(a) -> n(b,c), lm(b,c) -> n(a).
> > then n(a) rejoins.
> > jm(a) -> n(a,b,c)
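
The sequence above can be modelled with a small sketch, assuming each
node keeps its membership view as a bitmask.  struct node_view,
deliver_leave() and deliver_join() are hypothetical names for this
example and are not part of sheep or corosync.

```c
#include <stdbool.h>

/* One bit per node; the names mirror n(a), n(b), n(c) in the thread. */
enum { NODE_A = 1u << 0, NODE_B = 1u << 1, NODE_C = 1u << 2 };

/* Hypothetical per-node state: who I am and who I think is a member. */
struct node_view {
	unsigned self;
	unsigned members;
};

/*
 * Apply a leave message lm(left) to a view.  Returns false and leaves
 * the view untouched if the message names the receiver itself, which is
 * the case the thread argues should never be delivered.
 */
static bool deliver_leave(struct node_view *v, unsigned left)
{
	if (left & v->self)
		return false;
	v->members &= ~left;
	return true;
}

/* Apply a join message jm(joined) to a view. */
static void deliver_join(struct node_view *v, unsigned joined)
{
	v->members |= joined;
}
```

In this model, lm(b,c) delivered to n(a) shrinks a's view to itself,
lm(a) delivered to n(b) leaves (b,c), and jm(a) restores the full view
on n(b).  A leave message naming a, delivered to n(a), is rejected.
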
> >
> > Anyway, I am not sure, because I didn't look at the log.
> >
> > Thanks,
> > Yuan
> > --
> > sheepdog mailing list
> > sheepdog at lists.wpkg.org
> > http://lists.wpkg.org/mailman/listinfo/sheepdog
> >
