[sheepdog] [PATCH] recovery: avoid recovering object from node left
levin li
levin108 at gmail.com
Wed May 23 10:51:10 CEST 2012
From: levin li <xingke.lwp at taobao.com>
In the recovery path, sheep may roll back to an old epoch in
which some nodes have since left the cluster. We shouldn't try
to recover objects from those nodes, so add a check function
that verifies the target node is still a valid node in the
current epoch.
Signed-off-by: levin li <xingke.lwp at taobao.com>
---
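For readers who want to try the check outside the sheep tree, below is a
minimal standalone sketch of the same idea. It is not the code in this
patch: the demo_* structs are simplified stand-ins whose field layout is
assumed, and demo_recover_fn is a hypothetical callback standing in for
recover_object_from_replica(). It only illustrates matching on
(addr, port) and skipping the recovery call when the target is no longer
a member.

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for sheepdog's node structs (layout is assumed). */
struct demo_node {
	uint8_t addr[16];
	uint16_t port;
};

struct demo_vnode {
	uint8_t addr[16];
	uint16_t port;
};

/* Return 0 if entry's (addr, port) is present in nodes[], -1 otherwise. */
static int demo_check_entry_valid(const struct demo_vnode *entry,
				  const struct demo_node *nodes, int nr_nodes)
{
	int i;

	for (i = 0; i < nr_nodes; i++) {
		if (!memcmp(entry->addr, nodes[i].addr, sizeof(entry->addr)) &&
		    entry->port == nodes[i].port)
			return 0;
	}

	return -1;
}

/* Hypothetical callback standing in for the real recovery routine. */
typedef int (*demo_recover_fn)(const struct demo_vnode *target);

/*
 * Call recover() only when the target is still in the current node list.
 * A negative return tells the caller to fall back to another candidate.
 */
static int demo_recover_if_valid(const struct demo_vnode *target,
				 const struct demo_node *cur_nodes,
				 int cur_nr_nodes, demo_recover_fn recover)
{
	int ret = demo_check_entry_valid(target, cur_nodes, cur_nr_nodes);

	if (!ret)
		ret = recover(target);

	return ret;
}

/* Stub callback for experiments: counts calls and always succeeds. */
static int demo_recover_calls;

static int demo_recover_ok(const struct demo_vnode *target)
{
	(void)target;
	demo_recover_calls++;
	return 0;
}
```

Returning 0 for "valid" and -1 for "not found" lets the result feed the
existing "ret < 0" fallback without an extra branch, which is why the
sketch keeps that convention.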
sheep/recovery.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/sheep/recovery.c b/sheep/recovery.c
index afe58ac..491e876 100644
--- a/sheep/recovery.c
+++ b/sheep/recovery.c
@@ -302,6 +302,20 @@ static void rollback_old_cur(struct sd_vnode *old, int *old_nr, int *old_copies,
*old_copies = new_old_copies;
}
+static int check_entry_valid(struct sd_vnode *entry, struct sd_node *nodes,
+			     int nr_nodes)
+{
+	int i;
+
+	for (i = 0; i < nr_nodes; i++) {
+		if (!memcmp(entry->addr, nodes[i].addr, sizeof(entry->addr)) &&
+		    entry->port == nodes[i].port)
+			return 0;
+	}
+
+	return -1;
+}
+
/*
* Recover the object from its track in epoch history. That is,
* the routine will try to recovery it from the nodes it has stayed,
@@ -345,7 +359,9 @@ again:
}
tgt_entry = old + tgt_idx;
- ret = recover_object_from_replica(oid, tgt_entry, epoch, tgt_epoch);
+ ret = check_entry_valid(tgt_entry, rw->cur_nodes, rw->cur_nr_nodes);
+ if (!ret)
+ ret = recover_object_from_replica(oid, tgt_entry, epoch, tgt_epoch);
if (ret < 0) {
struct sd_vnode *new_old;
int new_old_nr, new_old_copies;
--
1.7.10