[Sheepdog] [PATCH v2] sheep: fix a network partition issue
zituan at taobao.com
Tue Oct 25 08:55:37 CEST 2011
From: Yibin Shen <zituan at taobao.com>
In some situations, sheep may be momentarily disconnected from corosync.
Both sheep and corosync keep running and neither of them exits, so the
disconnected sheep may then receive a confchg message from corosync
notifying it that this very sheep has left.
That leads to a network partition. This patch fixes it by panicking when
the local node shows up on the left list.
Signed-off-by: Yibin Shen <zituan at taobao.com>
---
sheep/group.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/sheep/group.c b/sheep/group.c
index e22dabc..ab5a9f0 100644
--- a/sheep/group.c
+++ b/sheep/group.c
@@ -1467,6 +1467,9 @@ static void sd_leave_handler(struct sheepdog_node_list_entry *left,
 	struct work_leave *w = NULL;
 	int i, size;
 
+	if (node_cmp(left, &sys->this_node) == 0)
+		panic("BUG: this node can't be on the left list\n");
+
 	dprintf("leave %s\n", node_to_str(left));
 	for (i = 0; i < nr_members; i++)
 		dprintf("[%x] %s\n", i, node_to_str(members + i));
--
1.7.7
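
For readers without the sheepdog tree at hand, the check added above can be sketched in isolation. This is only an illustration under stated assumptions: struct node_entry, node_entry_cmp() and left_is_self() are hypothetical stand-ins for struct sheepdog_node_list_entry, node_cmp() and the new test in sd_leave_handler(); they are not the actual sheepdog definitions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct sheepdog_node_list_entry. */
struct node_entry {
	uint8_t addr[16];
	uint16_t port;
};

/* Hypothetical comparison: returns 0 when both entries name the same node. */
static int node_entry_cmp(const struct node_entry *a,
			  const struct node_entry *b)
{
	int cmp = memcmp(a->addr, b->addr, sizeof(a->addr));

	if (cmp != 0)
		return cmp;
	if (a->port != b->port)
		return a->port < b->port ? -1 : 1;
	return 0;
}

/*
 * Returns 1 when the node reported as "left" is the local node itself,
 * i.e. the situation the patch treats as a bug, and 0 otherwise.
 */
static int left_is_self(const struct node_entry *left,
			const struct node_entry *self)
{
	return node_entry_cmp(left, self) == 0;
}
```

In the patch itself the equivalent condition leads straight to panic(), so a sheep that has been dropped by corosync stops instead of carrying on with its own view of the cluster.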