[sheepdog] [PATCH 1/2] sheep: get vdi bitmap from all the nodes

morita.kazutaka at gmail.com
Wed Jul 10 10:50:19 CEST 2013


From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>

Currently, we assume that the existing nodes have the complete vdi
bitmap and that reading it from one of them is enough.  However, this
assumption is wrong if a joining node has a vdi object which is not
in the running cluster.  For example,

 1. Sheepdog is running with one node, A.

 2. Two nodes, B and C, join Sheepdog at the same time, and node B
    has a vdi object which is not on node A.

 3. If C calls get_vdis_from() against A before A calls it against B,
    C's vdi bitmap will not contain that vdi object.

A safe and simple way to fix this problem is:

 - The newly joined node calls get_vdis_from() against all the
   existing nodes.
 - Each existing node calls get_vdis_from() only against the newly
   joined node.

This could be optimized, but I see no simple way to do so, so I left
it as a TODO in the source code.

Signed-off-by: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
---
 sheep/group.c |   31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)
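
To illustrate the race described in the commit message, here is a
minimal standalone sketch.  It is not sheepdog code: toy_racy_join(),
toy_self_check(), the TOY_* macros and the 64-bit toy bitmap are all
hypothetical names made up for this note.  It assumes that C handles
its own join before A has merged B's bitmap, which is the ordering
from step 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in sheepdog. */
enum { NODE_A, NODE_B, NODE_C, NR_NODES };

#define TOY_VDI_ON_A (UINT64_C(1) << 0)	/* vdi known to the running cluster */
#define TOY_VDI_ON_B (UINT64_C(1) << 1)	/* vdi that only node B has */

/*
 * Replays the scenario: A is running, B and C join together, and C
 * fetches bitmaps before A has fetched B's bitmap.  Returns the toy
 * vdi bitmap that C ends up with.
 */
uint64_t toy_racy_join(bool read_all_members)
{
	uint64_t bitmap[NR_NODES] = {
		[NODE_A] = TOY_VDI_ON_A,
		[NODE_B] = TOY_VDI_ON_B,
		[NODE_C] = 0,
	};

	/* C handles its own join first; the member list is A, B, C. */
	for (int i = 0; i < NR_NODES; i++) {
		if (i == NODE_C)
			continue;

		/* stands in for get_vdis_from(): merge the remote bitmap */
		bitmap[NODE_C] |= bitmap[i];

		/* old assumption: one copy from any member is enough */
		if (!read_all_members)
			break;
	}

	/* Only now does A get around to reading from the joined node B. */
	bitmap[NODE_A] |= bitmap[NODE_B];

	return bitmap[NODE_C];
}

/* Usage example: compare the two strategies under the same ordering. */
void toy_self_check(void)
{
	/* reading one member: C misses the vdi that only B has */
	assert(toy_racy_join(false) == TOY_VDI_ON_A);
	/* reading all members: C sees both vdis */
	assert(toy_racy_join(true) == (TOY_VDI_ON_A | TOY_VDI_ON_B));
}
```

In this model, reading from every existing member makes the result
independent of when A merges B's bitmap, at the cost of one request
per member.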

diff --git a/sheep/group.c b/sheep/group.c
index 370c625..2f52d0d 100644
--- a/sheep/group.c
+++ b/sheep/group.c
@@ -545,12 +545,13 @@ static void do_get_vdis(struct work *work)
 	int i, ret;
 
 	if (!node_is_local(&w->joined)) {
-		switch (sys->status) {
-		case SD_STATUS_OK:
-		case SD_STATUS_HALT:
-			get_vdis_from(&w->joined);
-			return;
-		}
+		sd_dprintf("try to get vdi bitmap from %s",
+			   node_to_str(&w->joined));
+		ret = get_vdis_from(&w->joined);
+		if (ret != SD_RES_SUCCESS)
+			sd_printf(SDOG_ALERT, "failed to get vdi bitmap from "
+				  "%s", node_to_str(&w->joined));
+		return;
 	}
 
 	for (i = 0; i < w->nr_members; i++) {
@@ -558,17 +559,21 @@ static void do_get_vdis(struct work *work)
 		if (node_is_local(&w->members[i]))
 			continue;
 
+		sd_dprintf("try to get vdi bitmap from %s",
+			   node_to_str(&w->members[i]));
 		ret = get_vdis_from(&w->members[i]);
-		if (ret != SD_RES_SUCCESS)
+		if (ret != SD_RES_SUCCESS) {
 			/* try to read from another node */
+			sd_printf(SDOG_ALERT, "failed to get vdi bitmap from "
+				  "%s", node_to_str(&w->members[i]));
 			continue;
+		}
 
 		/*
-		 * If a new comer try to join the running cluster, it only
-		 * need read one copy of bitmap from one of other members.
+		 * TODO: If the target node has a valid vdi bitmap (the node has
+		 * already called do_get_vdis against all the nodes), we can
+		 * exit this loop here.
 		 */
-		if (sys->status == SD_STATUS_WAIT_FOR_FORMAT)
-			break;
 	}
 }
 
@@ -728,6 +733,8 @@ static void update_cluster_info(const struct join_message *msg,
 	main_thread_set(current_vnode_info,
 			  alloc_vnode_info(nodes, nr_nodes));
 
+	get_vdis(nodes, nr_nodes, joined);
+
 	switch (msg->cluster_status) {
 	case SD_STATUS_OK:
 	case SD_STATUS_HALT:
@@ -746,8 +753,6 @@ static void update_cluster_info(const struct join_message *msg,
 			break;
 		}
 
-		get_vdis(nodes, nr_nodes, joined);
-
 		sys->status = msg->cluster_status;
 
 		if (msg->inc_epoch) {
-- 
1.7.9.5



