[Sheepdog] [PATCH] fix a bug in recovery which makes sheep get an incomplete epoch node list

Li Wenpeng levin108 at gmail.com
Wed Apr 11 08:50:07 CEST 2012


From: levin li <xingke.lwp at taobao.com>

In do_recover_object(), when recovery fails at some epoch, we need to go
back to the previous epoch. However, get_vnodes_from_epoch() returns an
incomplete node list for the specified epoch, so the target node from which
we recover the object is sometimes wrong, and in that case recovery always
fails. The cause is that the length passed when reading the epoch file was
the number of array elements, ARRAY_SIZE(nodes), rather than the buffer size
in bytes, sizeof(nodes), so the read was shorter than expected. Pass
sizeof(nodes) instead.

Signed-off-by: levin li <xingke.lwp at taobao.com>
---
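Illustration only, not part of the patch: a minimal standalone sketch of
why passing an element count where a byte length is expected truncates the
node list. Every name here (demo_node, demo_epoch_read, demo_nodes_read,
DEMO_MAX_NODES) is hypothetical and merely stands in for struct sd_node,
epoch_log_read() and SD_MAX_NODES. The sketch assumes the read helper takes
its length in bytes and returns the number of bytes read, as the
"nr /= sizeof(nodes[0])" line in the diff suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#define DEMO_MAX_NODES 64	/* hypothetical stand-in for SD_MAX_NODES */

/* Hypothetical stand-in for struct sd_node; the real layout differs. */
struct demo_node {
	uint8_t addr[16];
	uint16_t port;
};

/*
 * Hypothetical stand-in for epoch_log_read(): copies at most len BYTES
 * of the stored node list into buf and returns the bytes copied.
 */
static int demo_epoch_read(const struct demo_node *stored, int stored_nr,
			   char *buf, int len)
{
	int total = stored_nr * (int)sizeof(*stored);

	if (len > total)
		len = total;
	memcpy(buf, stored, (size_t)len);
	return len;
}

/*
 * Reads an epoch holding stored_nr nodes with the given length argument
 * and returns how many complete nodes the caller ends up with.
 */
static int demo_nodes_read(int stored_nr, int len)
{
	struct demo_node stored[DEMO_MAX_NODES];
	struct demo_node nodes[DEMO_MAX_NODES];
	int nr;

	assert(stored_nr >= 0 && stored_nr <= DEMO_MAX_NODES);
	assert(len >= 0 && (size_t)len <= sizeof(nodes));
	memset(stored, 0, sizeof(stored));

	nr = demo_epoch_read(stored, stored_nr, (char *)nodes, len);
	return nr / (int)sizeof(nodes[0]);
}
```

With an epoch of 6 nodes, demo_nodes_read(6, ARRAY_SIZE(nodes)) yields fewer
than 6 nodes because only DEMO_MAX_NODES bytes are read, while
demo_nodes_read(6, sizeof(nodes)) yields all 6.
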
 sheep/store.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/sheep/store.c b/sheep/store.c
index 5d82e5a..739862c 100644
--- a/sheep/store.c
+++ b/sheep/store.c
@@ -916,7 +916,7 @@ int epoch_log_read_remote(uint32_t epoch, char *buf, int len)
 	char host[128];
 	struct sd_node nodes[SD_MAX_NODES];
 
-	nr = epoch_log_read(le, (char *)nodes, ARRAY_SIZE(nodes));
+	nr = epoch_log_read(le, (char *)nodes, sizeof(nodes));
 	nr /= sizeof(nodes[0]);
 	for (i = 0; i < nr; i++) {
 		if (is_myself(nodes[i].addr, nodes[i].port))
@@ -1286,9 +1286,9 @@ static void *get_vnodes_from_epoch(int epoch, int *nr, int *copies)
 	struct sd_node nodes[SD_MAX_NODES];
 	void *buf = xmalloc(len);
 
-	nodes_nr = epoch_log_read_nr(epoch, (void *)nodes, ARRAY_SIZE(nodes));
+	nodes_nr = epoch_log_read_nr(epoch, (void *)nodes, sizeof(nodes));
 	if (nodes_nr < 0) {
-		nodes_nr = epoch_log_read_remote(epoch, (void *)nodes, ARRAY_SIZE(nodes));
+		nodes_nr = epoch_log_read_remote(epoch, (void *)nodes, sizeof(nodes));
 		if (nodes_nr == 0) {
 			free(buf);
 			return NULL;
-- 
1.7.1



