[sheepdog] [PATCH v2 4/4] cluster driver: handle pending block/notify event during reconnect

MORITA Kazutaka morita.kazutaka at gmail.com
Tue Jul 9 02:46:19 CEST 2013


At Sun,  7 Jul 2013 21:20:51 -0700,
Kai Zhang wrote:
> 
> The current reconnection implementation doesn't handle pending block/notify
> events.
> 
> A notify event is easy to handle: just send it again.
> 
> A block event, however, is a bit more complex,
> because it needs 4 steps:
> 1. in queue_cluster_request(), send block event by sys->cdrv->block(), and
>   add to pending_block_list.
> 2. in sd_block_handler(), queue the event to work queue of 'block' thread.
> 3. in cluster_op_done(), send unblock event by sys->cdrv->unblock().
> 4. in sd_notify_handler(), remove it from pending_block_list.
> 
> Steps 1 and 3 contain broadcast operations,
> so we have to know which step has been done for a pending block event.
> 
> If step 1 has been done, we can simply re-queue it. (Any block events sent
> by this node have been removed due to the leave event.)
> If step 2 has been done, the event is being handled by another thread. We have to mark
> it as 'drop' so that it will be dropped when cluster_op_done() is called later.
> If step 3 has been done, we should call sd_notify_handler() manually to finish
> it.
> 
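
To make the three cases above concrete, here is a minimal sketch of how a
reconnect path might pick an action from the request status. It is not the
code from this patch: struct fake_request, enum recover_action and
recover_pending_block() are hypothetical names, and the mapping of
REQUEST_INIT to "step 1 done" and REQUEST_QUEUED to "step 2 done" is my
assumption. Only the enum values and the REQUEST_DONE assignment in
cluster_op_done() come from the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Status values as posted in the patch (identifier spelling kept). */
enum REQUST_STATUS {
	REQUEST_INIT,
	REQUEST_QUEUED,
	REQUEST_DONE,
	REQUEST_DROPPED
};

/* Hypothetical stand-in for struct request; only the field used here. */
struct fake_request {
	enum REQUST_STATUS status;
};

/* Hypothetical actions, one per case in the description above. */
enum recover_action {
	RECOVER_REQUEUE,      /* step 1 done: send the block event again */
	RECOVER_MARK_DROPPED, /* step 2 done: let cluster_op_done() drop it */
	RECOVER_FINISH,       /* step 3 done: finish via the notify handler */
	RECOVER_NONE          /* already dropped: nothing to do */
};

static enum recover_action recover_pending_block(struct fake_request *req)
{
	assert(req != NULL);

	switch (req->status) {
	case REQUEST_INIT:
		/* Assumed: the block event was sent but not yet handled. */
		return RECOVER_REQUEUE;
	case REQUEST_QUEUED:
		/* Assumed: another thread is working on it; mark it. */
		req->status = REQUEST_DROPPED;
		return RECOVER_MARK_DROPPED;
	case REQUEST_DONE:
		/* The unblock event was sent before the reconnect. */
		return RECOVER_FINISH;
	case REQUEST_DROPPED:
		return RECOVER_NONE;
	}
	return RECOVER_NONE;
}
```

In the real driver the caller would presumably walk pending_block_list and
act on the returned value; that part is left out here.
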
> Signed-off-by: Kai Zhang <kyle at zelin.io>
> ---
>  sheep/group.c      |   69 ++++++++++++++++++++++++++++++++++++++++++++++++++--
>  sheep/sheep_priv.h |    8 ++++++
>  2 files changed, 75 insertions(+), 2 deletions(-)
> 
> diff --git a/sheep/group.c b/sheep/group.c
> index 2fa4091..1a549de 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -251,6 +251,9 @@ static void cluster_op_done(struct work *work)
>  	struct vdi_op_message *msg;
>  	size_t size;
>  
> +	if (req->status == REQUEST_DROPPED)

I think adding an sd_dprintf() call here would help us debug.
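
Something along these lines is what I have in mind. This is only a
standalone sketch: the sd_dprintf() macro below is a stand-in that writes to
stderr so the snippet compiles on its own, and struct fake_request and
should_drop() are hypothetical names, not sheep code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for sheepdog's sd_dprintf(); prints one line to stderr. */
#define sd_dprintf(...) \
	do { fprintf(stderr, __VA_ARGS__); fputc('\n', stderr); } while (0)

/* Status values as posted in the patch (identifier spelling kept). */
enum REQUST_STATUS {
	REQUEST_INIT,
	REQUEST_QUEUED,
	REQUEST_DONE,
	REQUEST_DROPPED
};

/* Hypothetical stand-in for struct request. */
struct fake_request {
	enum REQUST_STATUS status;
};

/* Returns true when the request must take the 'drop' path. */
static bool should_drop(const struct fake_request *req)
{
	assert(req != NULL);

	if (req->status == REQUEST_DROPPED) {
		/* Log before bailing out so the drop is visible in the log. */
		sd_dprintf("dropping request %p", (const void *)req);
		return true;
	}
	return false;
}
```
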

> +		goto drop;
> +
>  	sd_dprintf("%s (%p)", op_name(req->op), req);
>  
>  	msg = prepare_cluster_msg(req, &size);
> @@ -266,6 +269,13 @@ static void cluster_op_done(struct work *work)
>  	}
>  
>  	free(msg);
> +	req->status = REQUEST_DONE;
> +	return;
> +drop:
> +	list_del(&req->pending_list);
> +	req->rp.result = SD_RES_CLUSTER_ERROR;
> +	put_request(req);
> +	cluster_op_running = false;
>  }
>  
>  /*


> diff --git a/sheep/sheep_priv.h b/sheep/sheep_priv.h
> index c406534..d2b5364 100644
> --- a/sheep/sheep_priv.h
> +++ b/sheep/sheep_priv.h
> @@ -39,6 +39,13 @@ struct client_info {
>  	int refcnt;
>  };
>  
> +enum REQUST_STATUS {
> +	REQUEST_INIT,
> +	REQUEST_QUEUED,
> +	REQUEST_DONE,
> +	REQUEST_DROPPED
> +};
> +
>  struct request {
>  	struct sd_req rq;
>  	struct sd_rsp rp;
> @@ -61,6 +68,7 @@ struct request {
>  	struct vnode_info *vinfo;
>  
>  	struct work work;
> +	int status;

How about declaring this as 'enum REQUST_STATUS status'?
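
A rough sketch of the difference, using a trimmed-down hypothetical struct
(struct request_sketch and request_status_name() are made-up names). With
the field typed as the enum, a switch without a default label lets gcc's
-Wswitch (part of -Wall) point out any status value we forget to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Status values as posted in the patch (identifier spelling kept). */
enum REQUST_STATUS {
	REQUEST_INIT,
	REQUEST_QUEUED,
	REQUEST_DONE,
	REQUEST_DROPPED
};

/* Hypothetical, trimmed-down version of struct request. */
struct request_sketch {
	enum REQUST_STATUS status; /* instead of: int status; */
};

/* Hypothetical helper, e.g. for debug messages. */
static const char *request_status_name(const struct request_sketch *req)
{
	assert(req != NULL);

	/* No default label on purpose, so -Wswitch can flag missing cases. */
	switch (req->status) {
	case REQUEST_INIT:
		return "init";
	case REQUEST_QUEUED:
		return "queued";
	case REQUEST_DONE:
		return "done";
	case REQUEST_DROPPED:
		return "dropped";
	}
	return "unknown";
}
```
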

Thanks,

Kazutaka


