At Tue, 25 Oct 2011 15:06:05 +0800,
Liu Yuan wrote:
>
> On 10/25/2011 02:55 PM, zituan at taobao.com wrote:
> > From: Yibin Shen <zituan at taobao.com>
> >
> > In some situation, sheep may disconnected from corosync instantaneously,
> > at the same time, both sheep and corosync will keep running but
> > none of them exit, then the disconnected sheep may receive a confchg
> > message from corosync which notify this sheep has left.
> > that will lead to a network partition, this patch fix it.
> >
> > Signed-off-by: Yibin Shen <zituan at taobao.com>
> > ---
> >  sheep/group.c |    3 +++
> >  1 files changed, 3 insertions(+), 0 deletions(-)
> >
> > diff --git a/sheep/group.c b/sheep/group.c
> > index e22dabc..ab5a9f0 100644
> > --- a/sheep/group.c
> > +++ b/sheep/group.c
> > @@ -1467,6 +1467,9 @@ static void sd_leave_handler(struct sheepdog_node_list_entry *left,
> >  	struct work_leave *w = NULL;
> >  	int i, size;
> >
> > +	if (node_cmp(left, &sys->this_node) == 0)
> > +		panic("BUG: this node can't be on the left list\n");
> > +
> >
> Hmm, the panic output looks confusing. how about "Network Patition Bug:
> I should have exited.\n"? since the output will be seen by
> administrators, not only programmer.

Applied after modifying output text, thanks!

Kazutaka
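
For readers following the thread, the check the patch adds can be sketched in isolation roughly as below. This is only an illustration under assumptions, not the sheepdog code: struct node, its fields, this node_cmp() body, and left_is_self() are hypothetical stand-ins for struct sheepdog_node_list_entry and the real node_cmp() in sheep/group.c. The only thing taken from the patch is the idea that a compare result of 0 means the "left" entry is this node itself.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct sheepdog_node_list_entry. */
struct node {
	uint8_t addr[16];
	uint16_t port;
};

/*
 * Assumed comparison: order by address bytes, then by port.
 * Returns 0 only when both entries describe the same node.
 */
static int node_cmp(const struct node *a, const struct node *b)
{
	int c = memcmp(a->addr, b->addr, sizeof(a->addr));

	if (c != 0)
		return c;
	if (a->port != b->port)
		return a->port < b->port ? -1 : 1;
	return 0;
}

/*
 * Returns 1 when a leave notification names this node itself,
 * which is the situation the patch treats as fatal.
 */
static int left_is_self(const struct node *left, const struct node *self)
{
	return node_cmp(left, self) == 0;
}
```

In the patch itself a positive result leads straight to panic(), so the partitioned sheep stops instead of continuing to run against a cluster that already considers it gone.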