[sheepdog] [PATCH] sheep: fix a dead-lock bug in screen_object_list()
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Mon Nov 26 10:50:33 CET 2012
At Mon, 26 Nov 2012 13:48:50 +0800,
levin li wrote:
>
> From: levin li <xingke.lwp at taobao.com>
>
> The bug is introduced in the patch 97ccd87ea15e606b6ec9fecb54f5de453f9c5c1f
> which causes the cluster hangs in recovery, this patch is like a revert patch
> except that I add comment to explain why we should not ignore the objects
> whose nr_objs are zero in screen_object_list().
>
> screen_object_list() is called in recovery, if a VDI creation is in progress,
> then the VDI creation operation is blocked by gateway to wait for completion of
> recovery, and if we make screen_object_list() to retry infinitely to wait the
> nr_objs to be not zero, which also means it waits for VDI creation to complete,
> so, both VDI creation and recovery can not finish, it's a dead-lock.
>
> As a fact, we don't need to recovery the objects whose nr_objs are zero, these
> objects can only be VDI objects which has not been created completely because
> of a recovery, after the recovery finishes, gateway would retry to create the
> objects and put it into correct location.
>
> Signed-off-by: levin li <xingke.lwp at taobao.com>
> ---
> sheep/recovery.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/sheep/recovery.c b/sheep/recovery.c
> index 9c5c712..bb71cba 100644
> --- a/sheep/recovery.c
> +++ b/sheep/recovery.c
> @@ -539,17 +539,18 @@ static void screen_object_list(struct recovery_work *rw,
> int i, j;
>
> for (i = 0; i < nr_oids; i++) {
> -again:
> nr_objs = get_obj_copy_number(oids[i], rw->cur_vinfo->nr_zones);
> if (!nr_objs) {
> dprintf("can not find copy number for object %" PRIx64
> "\n", oids[i]);
> - dprintf("probably, vdi was created but "
> - "post_cluster_new_vdi() is not called yet\n");
> - /* FIXME: can we wait for post_cluster_new_vdi
> - * with a better way? */
> - sleep(1);
> - goto again;
> + /*
> + * If nr_objs is zero, then the object is a VDI object,
> + * and the creation of the VDI is in progress, then we
> + * don't need to recover this object, as after recovery,
> + * the gateway would retry to complete the creation by
> + * putting the objects of the VDI into corrent location.
> + */
This is not correct. If recovery happens after cluster_new_vdi()
succeeds in vdi creation and before post_cluster_new_vdi() sets
vdi.copies, there is no one who will retry vdi creation. I actually
encountered this problem before and could reproduce it with the
current test framework.
That being said, dead lock should be definitely avoided. I suggest
adding the problem introduced by this revert as FIXME.
Thanks,
Kazutaka
More information about the sheepdog
mailing list