[sheepdog] [PATCH] sheep: fix a dead-lock bug in screen_object_list()

levin li levin108 at gmail.com
Mon Nov 26 06:48:50 CET 2012

From: levin li <xingke.lwp at taobao.com>

The bug was introduced by commit 97ccd87ea15e606b6ec9fecb54f5de453f9c5c1f,
which causes the cluster to hang during recovery. This patch is essentially a
revert, except that I add a comment explaining why it is safe to skip objects
whose nr_objs is zero in screen_object_list().

screen_object_list() is called during recovery. If a VDI creation is in
progress, the gateway blocks the creation until recovery completes. If
screen_object_list() retries indefinitely until nr_objs becomes non-zero, it
is in effect waiting for the VDI creation to complete. Neither the VDI
creation nor the recovery can then finish, so it is a deadlock.

In fact, we don't need to recover objects whose nr_objs is zero. These can
only be VDI objects that have not been fully created because of the recovery.
After the recovery finishes, the gateway retries the creation and puts the
objects into the correct location.
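
The skip-instead-of-wait behaviour described above can be sketched in
isolation as follows. This is only an illustration, not sheepdog code:
toy_copy_number(), toy_screen() and TOY_PENDING are hypothetical names, and
the rule that a flag bit marks a VDI whose creation is still in progress is an
assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker: an oid with this bit set belongs to a VDI whose
 * creation has not completed yet (assumption for illustration only). */
#define TOY_PENDING (UINT64_C(1) << 63)

/* Hypothetical stand-in for a copy-number lookup: returns 0 when the
 * copy number is not known yet, otherwise a fixed value of 3. */
int toy_copy_number(uint64_t oid)
{
	return (oid & TOY_PENDING) ? 0 : 3;
}

/* Copy into 'out' only the oids whose copy number is known and return
 * how many were kept. Objects with a zero copy number are skipped
 * rather than retried, so the loop never waits on a creation that is
 * itself blocked until this loop finishes. */
size_t toy_screen(const uint64_t *oids, size_t nr_oids, uint64_t *out)
{
	size_t i, kept = 0;

	assert(oids != NULL && out != NULL);

	for (i = 0; i < nr_oids; i++) {
		if (!toy_copy_number(oids[i]))
			continue;	/* skip; do not sleep and retry */
		out[kept++] = oids[i];
	}
	return kept;
}
```

With a sleep-and-retry in place of the continue, the loop would never
terminate for a pending oid in this sketch, which mirrors the hang described
above.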

Signed-off-by: levin li <xingke.lwp at taobao.com>
 sheep/recovery.c |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/sheep/recovery.c b/sheep/recovery.c
index 9c5c712..bb71cba 100644
--- a/sheep/recovery.c
+++ b/sheep/recovery.c
@@ -539,17 +539,18 @@ static void screen_object_list(struct recovery_work *rw,
 	int i, j;
 	for (i = 0; i < nr_oids; i++) {
 		nr_objs = get_obj_copy_number(oids[i], rw->cur_vinfo->nr_zones);
 		if (!nr_objs) {
 			dprintf("can not find copy number for object %" PRIx64
 				"\n", oids[i]);
-			dprintf("probably, vdi was created but "
-				"post_cluster_new_vdi() is not called yet\n");
-			/* FIXME: can we wait for post_cluster_new_vdi
-			 *        with a better way? */
-			sleep(1);
-			goto again;
+			/*
+			 * If nr_objs is zero, then the object is a VDI object,
+			 * and the creation of the VDI is in progress, then we
+			 * don't need to recover this object, as after recovery,
+			 * the gateway would retry to complete the creation by
+			 * putting the objects of the VDI into the correct location.
+			 */
+			continue;
 		oid_to_vnodes(rw->cur_vinfo->vnodes, rw->cur_vinfo->nr_vnodes,
 			      oids[i], nr_objs, vnodes);
