From: Yunkai Zhang <qiushu.zyk at taobao.com>

When a sheep joins the cluster, the master calls sd_check_join_cb() to
get the join_result, which is written to ev.buf, and all sheep receive
this update.

If join_result equals CJ_RES_MASTER_TRANSFER, the master kills itself by
calling exit(). Zookeeper needs up to SESSION_TIMEOUT to detect that the
master has left, so it is better to call zk_leave() explicitly before the
master exits.

Meanwhile, the other sheep continue to process the updated JOIN EVENT.
For now, Sheepdog assumes that only one sheep (the joining sheep) is
alive in the MASTER_TRANSFER scenario, which simplifies the processing
logic (we may drop this assumption in the future to handle other corner
cases).

Based on this assumption, the joining sheep only needs to reset its
member_list (stored in node_btree in the zookeeper driver) so that it
contains only itself.

Signed-off-by: Yunkai Zhang <qiushu.zyk at taobao.com>
---
 sheep/cluster/zookeeper.c |   15 +++++++--------
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/sheep/cluster/zookeeper.c b/sheep/cluster/zookeeper.c
index 3378535..24cdcf0 100644
--- a/sheep/cluster/zookeeper.c
+++ b/sheep/cluster/zookeeper.c
@@ -775,6 +775,7 @@ static int zk_dispatch(void)
 			if (res == CJ_RES_MASTER_TRANSFER) {
 				eprintf("failed to join sheepdog cluster: "
 					"please retry when master is up\n");
+				zk_leave();
 				exit(1);
 			}
 		} else
@@ -800,15 +801,13 @@ static int zk_dispatch(void)
 		if (node_eq(&ev.sender.node, &this_node.node))
 			zk_member_init(zhandle);
 
-		if (ev.join_result == CJ_RES_MASTER_TRANSFER) {
-			/* FIXME: This code is tricky, but Sheepdog assumes that
-			 * nr_nodes = 1 when join_result = MASTER_TRANSFER... */
+		if (ev.join_result == CJ_RES_MASTER_TRANSFER)
+			/*
+			 * Sheepdog assumes that only one sheep(master will kill
+			 * itself) is alive in MASTER_TRANSFER scenario. So only
+			 * the joining sheep will run into here.
+			 */
 			node_btree_clear(&zk_node_btroot);
-			node_btree_add(&zk_node_btroot, &this_node);
-
-			zk_queue_push_back(zhandle, &ev);
-			zk_queue_pop(zhandle, &ev);
-		}
 
 		node_btree_add(&zk_node_btroot, &ev.sender);
 		dprintf("one sheep joined[down], nr_nodes:%ld, sender:%s, joined:%d\n",
-- 
1.7.7.6
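
For readers who want to see the effect of the second hunk in isolation, here is a minimal standalone sketch of the member-list reset. It is not Sheepdog code: struct member_list, member_clear(), member_add(), handle_join() and the JOIN_* constants are hypothetical stand-ins for node_btree, node_btree_clear(), node_btree_add(), the join branch of zk_dispatch() and the CJ_RES_* values. It assumes, as the patch does, that only the joining sheep reaches this path on MASTER_TRANSFER.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_MEMBERS 8
#define NAME_LEN 16

/* Hypothetical stand-ins for the CJ_RES_* join results. */
#define JOIN_OK 0
#define JOIN_MASTER_TRANSFER 1

/* Hypothetical flat replacement for the driver's node_btree. */
struct member_list {
	int nr;
	char names[MAX_MEMBERS][NAME_LEN];
};

/* Drop every known member (stands in for node_btree_clear()). */
static void member_clear(struct member_list *list)
{
	list->nr = 0;
}

/* Append one member (stands in for node_btree_add()). */
static void member_add(struct member_list *list, const char *name)
{
	assert(list->nr < MAX_MEMBERS);
	snprintf(list->names[list->nr], NAME_LEN, "%s", name);
	list->nr++;
}

/*
 * Mirrors the patched join path: on MASTER_TRANSFER the stale list is
 * cleared first, then the sender is added unconditionally, so the list
 * ends up holding only the joining sheep.
 */
static void handle_join(struct member_list *list, int join_result,
			const char *sender)
{
	if (join_result == JOIN_MASTER_TRANSFER)
		member_clear(list);

	member_add(list, sender);
}
```

Because the sender is added after the optional clear, the sketch needs no separate "add myself" step, which is the same reason the patch can drop the extra node_btree_add(&zk_node_btroot, &this_node) call.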