[sheepdog] [PATCH v2 1/2] sheep: get vdi bitmap from all the nodes

MORITA Kazutaka morita.kazutaka at gmail.com
Thu Jul 11 06:28:26 CEST 2013


From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>

Currently, we assume that the existing nodes have the complete vdi
bitmap and that reading from one of them is enough.  However, this
assumption is wrong if a joining node has a vdi object which is not
in the running cluster.  For example,

 1. Sheepdog is running with one node A

 2. Two nodes, B and C, join Sheepdog at the same time, and node B
    has a vdi object which is not on node A.

 3. If C calls get_vdis_from() against A before A does so against B,
    C will not have the vdi object in its vdi bitmap.
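
The race in step 3 can be replayed with a toy model.  This is only a
sketch, not sheep code: the 32-bit vdi_bitmap type, fetch_bitmap() and
both join_*() helpers are hypothetical names, and it assumes that
fetching a peer's bitmap simply ORs it into the local one.

```c
#include <assert.h>
#include <stdint.h>

/* Toy vdi bitmap: one bit per vdi id (the real bitmap is far larger). */
typedef uint32_t vdi_bitmap;

/* Assumed merge semantics: a fetch ORs the peer's bits into our own. */
static void fetch_bitmap(vdi_bitmap *self, const vdi_bitmap *peer)
{
	assert(self && peer);
	*self |= *peer;
}

/* Step 3 as described: C reads only from A, before A has read from B. */
static vdi_bitmap join_reading_one_node(void)
{
	vdi_bitmap a = 0x1, b = 0x3, c = 0x0;	/* bit 1 exists only on B */

	fetch_bitmap(&c, &a);	/* C asks A first ... */
	fetch_bitmap(&a, &b);	/* ... and only then does A learn B's vdi */
	return c;
}

/* Same ordering, but C ends up reading B's bitmap as well. */
static vdi_bitmap join_reading_all_nodes(void)
{
	vdi_bitmap a = 0x1, b = 0x3, c = 0x0;

	fetch_bitmap(&c, &a);
	fetch_bitmap(&a, &b);
	fetch_bitmap(&c, &b);	/* the additional read closes the gap */
	return c;
}
```

In this model join_reading_one_node() leaves bit 1 unset in C's
bitmap, while join_reading_all_nodes() returns the full set regardless
of the order in which A and B exchange bitmaps.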

The safe and simple approach to fix this problem is:

 - The newly joined node calls get_vdis_from() against all the
   existing nodes.
 - Each existing node calls get_vdis_from() only against the newly
   joined node.
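
The two rules above amount to a small selection function.  The
following is a minimal sketch under assumptions, not the code in
sheep/group.c: struct node, MAX_NODES and pick_bitmap_sources() are
hypothetical, and nodes are identified by a plain integer id.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16

/* Hypothetical stand-in for a cluster member: only an id is kept. */
struct node {
	int id;
};

/*
 * Fill 'targets' with the nodes that 'self' should read the vdi bitmap
 * from when 'joined' enters a cluster whose membership, including the
 * joined node, is 'members'.  Returns the number of targets written.
 */
static size_t pick_bitmap_sources(int self, int joined,
				  const struct node *members,
				  size_t nr_members, struct node *targets)
{
	size_t i, n = 0;

	assert(nr_members <= MAX_NODES);

	if (self != joined) {
		/* Existing node: only the newcomer can hold unseen vdis. */
		targets[n++].id = joined;
		return n;
	}

	/* Newly joined node: read from every member except itself. */
	for (i = 0; i < nr_members; i++) {
		if (members[i].id == self)
			continue;
		targets[n++] = members[i];
	}
	return n;
}
```

With members {0, 1, 2} and node 2 joining, node 2 gets targets {0, 1}
and nodes 0 and 1 each get the single target {2}.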

This could be optimized (for example, by leaving the loop once a node
that already holds a valid vdi bitmap has been read), but I see no
simple way to do it, so I left it as a TODO in the source code.

This patch also updates tests/functional/052.out.  The newly joined
node can have a complete vdi bitmap even before sheepdog starts up, so
sheep can run object recovery correctly when we run 'collie cluster
recover force'.

Signed-off-by: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
---
 sheep/group.c            |   31 ++++++++++++++++++-------------
 tests/functional/052.out |    4 ----
 2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/sheep/group.c b/sheep/group.c
index 6f159b4..2030c44 100644
--- a/sheep/group.c
+++ b/sheep/group.c
@@ -541,12 +541,13 @@ static void do_get_vdis(struct work *work)
 	int i, ret;
 
 	if (!node_is_local(&w->joined)) {
-		switch (sys->status) {
-		case SD_STATUS_OK:
-		case SD_STATUS_HALT:
-			get_vdis_from(&w->joined);
-			return;
-		}
+		sd_dprintf("try to get vdi bitmap from %s",
+			   node_to_str(&w->joined));
+		ret = get_vdis_from(&w->joined);
+		if (ret != SD_RES_SUCCESS)
+			sd_printf(SDOG_ALERT, "failed to get vdi bitmap from "
+				  "%s", node_to_str(&w->joined));
+		return;
 	}
 
 	for (i = 0; i < w->nr_members; i++) {
@@ -554,17 +555,21 @@ static void do_get_vdis(struct work *work)
 		if (node_is_local(&w->members[i]))
 			continue;
 
+		sd_dprintf("try to get vdi bitmap from %s",
+			   node_to_str(&w->members[i]));
 		ret = get_vdis_from(&w->members[i]);
-		if (ret != SD_RES_SUCCESS)
+		if (ret != SD_RES_SUCCESS) {
 			/* try to read from another node */
+			sd_printf(SDOG_ALERT, "failed to get vdi bitmap from "
+				  "%s", node_to_str(&w->members[i]));
 			continue;
+		}
 
 		/*
-		 * If a new comer try to join the running cluster, it only
-		 * need read one copy of bitmap from one of other members.
+		 * TODO: If the target node has a valid vdi bitmap (the node has
+		 * already called do_get_vdis against all the nodes), we can
+		 * exit this loop here.
 		 */
-		if (sys->status == SD_STATUS_WAIT_FOR_FORMAT)
-			break;
 	}
 }
 
@@ -724,6 +729,8 @@ static void update_cluster_info(const struct join_message *msg,
 	main_thread_set(current_vnode_info,
 			  alloc_vnode_info(nodes, nr_nodes));
 
+	get_vdis(nodes, nr_nodes, joined);
+
 	switch (msg->cluster_status) {
 	case SD_STATUS_OK:
 	case SD_STATUS_HALT:
@@ -742,8 +749,6 @@ static void update_cluster_info(const struct join_message *msg,
 			break;
 		}
 
-		get_vdis(nodes, nr_nodes, joined);
-
 		sys->status = msg->cluster_status;
 
 		if (msg->inc_epoch) {
diff --git a/tests/functional/052.out b/tests/functional/052.out
index c02ba08..da34b8e 100644
--- a/tests/functional/052.out
+++ b/tests/functional/052.out
@@ -46,10 +46,6 @@ The cluster may need to be force recovered if:
 
 Are you sure you want to continue? [yes/no]: 
 finish check&repair test
-fixed missing 7c2b2500000001
-fixed missing 7c2b2500000002
-fixed missing 7c2b2500000003
-fixed missing 7c2b2500000004
 Cluster status: running, auto-recovery enabled
 
 Cluster created at DATE
-- 
1.7.9.5



