[sheepdog] [PATCH v2] recovery: avoid recovering object from node left

levin li levin108 at gmail.com
Thu May 24 08:00:48 CEST 2012


From: levin li <xingke.lwp at taobao.com>

v1 -- > v2:
rebased to the current master branch
------------------------------------------------ >8
In the recovery path, sheep may get to old epoch at which
some nodes have left the cluster, we shouldn't try to recover
objects from these nodes, so I add a check function to check
whether the target node is a valid node at current epoch.

Signed-off-by: levin li <xingke.lwp at taobao.com>
---
 sheep/recovery.c |   20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/sheep/recovery.c b/sheep/recovery.c
index f341fc6..9d1f010 100644
--- a/sheep/recovery.c
+++ b/sheep/recovery.c
@@ -193,6 +193,20 @@ static void rollback_old_cur(struct sd_vnode *old, int *old_nr, int *old_copies,
 	*old_copies = new_old_copies;
 }
 
+static int check_entry_valid(struct sd_vnode *entry, struct sd_node *nodes,
+				int nr_nodes)
+{
+	int i;
+
+	for (i = 0; i < nr_nodes; i++) {
+		if (!memcmp(entry->addr, nodes[i].addr, sizeof(entry->addr)) &&
+			entry->port == nodes[i].port)
+			return 0;
+	}
+
+	return -1;
+}
+
 /*
  * Recover the object from its track in epoch history. That is,
  * the routine will try to recovery it from the nodes it has stayed,
@@ -223,8 +237,10 @@ again:
 		int idx;
 		idx = obj_to_sheep(old, old_nr, oid, i);
 		tgt_entry = old + idx;
-		ret = recover_object_from_replica(oid, tgt_entry,
-						  epoch, tgt_epoch);
+		ret = check_entry_valid(tgt_entry, rw->cur_nodes, rw->cur_nr_nodes);
+		if (!ret)
+			ret = recover_object_from_replica(oid, tgt_entry,
+							  epoch, tgt_epoch);
 		if (ret == 0) {
 			/* Succeed */
 			break;
-- 
1.7.10




More information about the sheepdog mailing list