[sheepdog] [PATCH 2/2] zookeeper: fix master transfer logic

Yunkai Zhang yunkai.me at gmail.com
Fri May 18 12:28:39 CEST 2012


From: Yunkai Zhang <qiushu.zyk at taobao.com>

When a sheep joins the cluster, the master calls
sd_check_join_cb() to get the join_result, which is
written to ev.buf, and all sheep receive this update.

If join_result equals CJ_RES_MASTER_TRANSFER, the master
kills itself with exit(). Zookeeper needs up to SESSION_TIMEOUT
to detect that the master has left, so it is better to call
zk_leave() explicitly before the master exits. Meanwhile, the
other sheep continue to process the updated JOIN event.
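
To illustrate the ordering described above (leave explicitly, then exit),
here is a minimal sketch. It does not use the Zookeeper API: toy_driver,
toy_leave() and toy_fail_join() are hypothetical names made up for this
example, and toy_leave() only stands in for what zk_leave() is expected
to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a cluster driver; not Sheepdog's real struct. */
struct toy_driver {
	int registered;	/* 1 while peers still see this node as a member */
};

/* Hypothetical analogue of zk_leave(): deregister right away. */
void toy_leave(struct toy_driver *drv)
{
	assert(drv != NULL);
	drv->registered = 0;
}

/*
 * Handle a failed join. Returns the status the caller should pass to
 * exit(), so the ordering (leave first, then exit) stays easy to follow.
 */
int toy_fail_join(struct toy_driver *drv)
{
	fprintf(stderr, "failed to join cluster: "
		"please retry when master is up\n");
	toy_leave(drv);	/* peers need not wait for a session timeout */
	return 1;
}
```

A caller would write exit(toy_fail_join(&drv)); by the time the process
exits, the node is already deregistered in this toy model.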

For now, Sheepdog assumes that only one sheep (the joining sheep)
is alive in the MASTER_TRANSFER scenario, which simplifies the
processing logic (we may drop this assumption in the future to
handle other corner cases).

Based on this assumption, the joining sheep just needs to reset
its member list (saved in node_btree in the zookeeper driver) so
that it contains only itself.
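
The reset can be pictured with a small self-contained sketch. It uses a
plain array instead of the driver's node_btree, and every name in it
(toy_member_list, toy_handle_join(), TOY_JOIN_MASTER_TRANSFER, ...) is
hypothetical and chosen only for illustration.

```c
#include <assert.h>

#define TOY_MAX_NODES 8

/* Hypothetical join results, loosely mirroring CJ_RES_* values. */
enum toy_join_result { TOY_JOIN_SUCCESS, TOY_JOIN_MASTER_TRANSFER };

/* Toy member list: an array of node ids instead of a btree. */
struct toy_member_list {
	int nr;
	int ids[TOY_MAX_NODES];
};

void toy_list_clear(struct toy_member_list *l)
{
	l->nr = 0;
}

void toy_list_add(struct toy_member_list *l, int id)
{
	int i;

	for (i = 0; i < l->nr; i++)
		if (l->ids[i] == id)
			return;	/* already a member */
	assert(l->nr < TOY_MAX_NODES);
	l->ids[l->nr++] = id;
}

/*
 * Process a join event. On master transfer the old membership is
 * dropped first, so only the sender (the joining node) remains.
 */
void toy_handle_join(struct toy_member_list *l, int sender_id,
		     enum toy_join_result res)
{
	if (res == TOY_JOIN_MASTER_TRANSFER)
		toy_list_clear(l);
	toy_list_add(l, sender_id);
}
```

Because the sender is added unconditionally after the optional clear, no
separate "add myself" step is needed in the master transfer branch.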

Signed-off-by: Yunkai Zhang <qiushu.zyk at taobao.com>
---
 sheep/cluster/zookeeper.c |   15 +++++++--------
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/sheep/cluster/zookeeper.c b/sheep/cluster/zookeeper.c
index 3378535..24cdcf0 100644
--- a/sheep/cluster/zookeeper.c
+++ b/sheep/cluster/zookeeper.c
@@ -775,6 +775,7 @@ static int zk_dispatch(void)
 				if (res == CJ_RES_MASTER_TRANSFER) {
 					eprintf("failed to join sheepdog cluster: "
 						"please retry when master is up\n");
+					zk_leave();
 					exit(1);
 				}
 			} else
@@ -800,15 +801,13 @@ static int zk_dispatch(void)
 		if (node_eq(&ev.sender.node, &this_node.node))
 			zk_member_init(zhandle);
 
-		if (ev.join_result == CJ_RES_MASTER_TRANSFER) {
-			/* FIXME: This code is tricky, but Sheepdog assumes that
-			 * nr_nodes = 1 when join_result = MASTER_TRANSFER... */
+		if (ev.join_result == CJ_RES_MASTER_TRANSFER)
+			/*
+			 * Sheepdog assumes that only one sheep(master will kill
+			 * itself) is alive in MASTER_TRANSFER scenario. So only
+			 * the joining sheep will run into here.
+			 */
 			node_btree_clear(&zk_node_btroot);
-			node_btree_add(&zk_node_btroot, &this_node);
-
-			zk_queue_push_back(zhandle, &ev);
-			zk_queue_pop(zhandle, &ev);
-		}
 
 		node_btree_add(&zk_node_btroot, &ev.sender);
 		dprintf("one sheep joined[down], nr_nodes:%ld, sender:%s, joined:%d\n",
-- 
1.7.7.6



