[sheepdog] [PATCH] sheep: let all sheeps with smaller epoch added into delayed_nodes list
Yunkai Zhang
yunkai.me at gmail.com
Sat Jul 28 21:42:45 CEST 2012
From: Yunkai Zhang <qiushu.zyk at taobao.com>
Since sheeps in the delayed_nodes list won't cause recovery while the cluster
is in the WAIT_FOR_JOIN state, it's safe to put any sheep with a smaller epoch
into the delayed_nodes list, regardless of whether it once belonged to the cluster.
With this change, we no longer need to restart a sheep in the following scenario:
1) Start [0,1,2,3] sheeps:
epoch of [0,1,2,3] sheeps = 1
2) Kill [0] sheep, and then shut down [1,2,3] sheeps
epoch of [0] sheep = 1
epoch of [1,2,3] sheeps = 2
3) Start [1,2] sheeps:
epoch of [0] sheep = 1
epoch of [1,2,3] sheeps = 2
cluster status = WAIT_FOR_JOIN (waits for [3] sheep)
4) Start [0] sheep:
[0] sheep will be added into the delayed_nodes list, no restart needed
epoch of [0] sheep = 1
epoch of [1,2,3] sheeps = 2
cluster status = WAIT_FOR_JOIN (waits for [3] sheep)
5) Start [3] sheep:
epoch of [0,1,2,3] sheeps = 3
cluster status = OK
Now the cluster starts working...
Signed-off-by: Yunkai Zhang <qiushu.zyk at taobao.com>
---
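For illustration only, the behavior change in the smaller-epoch branch of
cluster_wait_for_join_check() can be sketched as a small standalone model.
This is not the sheepdog code: the enum, the function names and the
"was_member" / "patched" flags are hypothetical stand-ins, where "was_member"
plays the role of the removed bsearch() lookup in local_entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two join results touched by this patch. */
enum join_res { RES_FAIL, RES_JOIN_LATER };

/*
 * Models only the "joining node epoch too small" branch.
 * was_member: the joining node was found in the local member list
 *             (the role of the removed bsearch() check).
 * patched:    false = behavior before this patch, true = after.
 */
static enum join_res smaller_epoch_result(bool was_member, bool patched)
{
	if (!patched && was_member)
		return RES_FAIL;	/* old: former members were rejected */
	return RES_JOIN_LATER;		/* new: always go to delayed_nodes */
}

/* Step 4 of the scenario: [0] sheep (epoch 1, former member) joins
 * while the cluster is at epoch 2 and still in WAIT_FOR_JOIN. */
static void check_step4(void)
{
	assert(smaller_epoch_result(true, false) == RES_FAIL);
	assert(smaller_epoch_result(true, true) == RES_JOIN_LATER);
}
```

Under these assumptions, a node that never belonged to the cluster gets
RES_JOIN_LATER both before and after the patch; only former members change.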
sheep/group.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/sheep/group.c b/sheep/group.c
index f7c8ca7..bac371b 100644
--- a/sheep/group.c
+++ b/sheep/group.c
@@ -423,7 +423,6 @@ static bool add_delayed_node(uint32_t epoch, struct sd_node *node)
if (find_entry_list(node, &sys->delayed_nodes))
return false;
- assert(!find_entry_epoch(node, epoch));
n = xmalloc(sizeof(*n));
n->ent = *node;
@@ -558,10 +557,6 @@ static int cluster_wait_for_join_check(struct sd_node *joined,
eprintf("joining node epoch too small: %"
PRIu32 " vs %" PRIu32 "\n",
jm->epoch, local_epoch);
-
- if (bsearch(joined, local_entries, nr_local_entries,
- sizeof(struct sd_node), node_id_cmp))
- return CJ_RES_FAIL;
return CJ_RES_JOIN_LATER;
}
--
1.7.11.2