[sheepdog] [PATCH v2 4/4] cluster driver: handle pending block/notify event during reconnect

Tue Jul 9 04:37:00 CEST 2013

On Jul 8, 2013, at 4:42 PM, Liu Yuan <namei.unix at gmail.com> wrote:

> On Sun, Jul 07, 2013 at 09:20:51PM -0700, Kai Zhang wrote:
>> Current implementation of reconnection doesn't handle pending block/notify
>> event.
>> 
>> It is easy to handle notify event by sending it again.
>> 
>> However, it is a little bit complex for block event.
>> This is because a block event need 4 steps.
>> 1. in queue_cluster_request(), send block event by sys->cdrv->block(), and
>>  add to pending_block_list.
>> 2. in sd_block_handler(), queue the event to work queue of 'block' thread.
>> 3. in cluster_op_done(), send unblock event by sys->cdrv->unblock().
>> 4. in sd_notify_handler(), remove it from pending_block_list.
>> 
>> And step 1 and 3 contains broadcast operations.
>> So we have to know which step has been done for a pending block event.
>> 
>> If step 1 has been done, we can re-queue it simply. (Any block event which sent
>> by this node have been removed due to the leave event)
>> If step 2 has been done, the event is handling by another thread. We have to mark
>> it as 'drop' so that it will be dropped when cluster_op_done() is called later.
>> if step 3 has been done, we should call sd_notify_handler() manually to finish
>> it.
>> 
>> Signed-off-by: Kai Zhang <kyle at zelin.io>
>> ---
>> sheep/group.c      |   69 ++++++++++++++++++++++++++++++++++++++++++++++++++--
>> sheep/sheep_priv.h |    8 ++++++
>> 2 files changed, 75 insertions(+), 2 deletions(-)
>> 
>> diff --git a/sheep/group.c b/sheep/group.c
>> index 2fa4091..1a549de 100644
>> --- a/sheep/group.c
>> +++ b/sheep/group.c
>> @@ -251,6 +251,9 @@ static void cluster_op_done(struct work *work)
>> 	struct vdi_op_message *msg;
>> 	size_t size;
>> 
>> +	if (req->status == REQUEST_DROPPED)
>> +		goto drop;
>> +
>> 	sd_dprintf("%s (%p)", op_name(req->op), req);
>> 
>> 	msg = prepare_cluster_msg(req, &size);
>> @@ -266,6 +269,13 @@ static void cluster_op_done(struct work *work)
>> 	}
>> 
>> 	free(msg);
>> +	req->status = REQUEST_DONE;
>> +	return;
>> +drop:
>> +	list_del(&req->pending_list);
>> +	req->rp.result = SD_RES_CLUSTER_ERROR;
>> +	put_request(req);
>> +	cluster_op_running = false;
>> }
>> 
>> /*
>> @@ -295,6 +305,7 @@ bool sd_block_handler(const struct sd_node *sender)
>> 	req->work.done = cluster_op_done;
>> 
>> 	queue_work(sys->block_wqueue, &req->work);
>> +	req->status = REQUEST_QUEUED;
>> 	return true;
>> }
>> 
>> @@ -329,7 +340,7 @@ void queue_cluster_request(struct request *req)
>> 
>> 		free(msg);
>> 	}
>> -
>> +	req->status = REQUEST_INIT;
>> 	return;
>> error:
>> 	req->rp.result = SD_RES_CLUSTER_ERROR;
>> @@ -927,6 +938,60 @@ static int send_join_request(struct sd_node *ent)
>> 	return ret;
>> }
>> 
>> +static void requeue_cluster_request(void)
>> +{
>> +	struct request *req, *p;
>> +	struct vdi_op_message *msg;
>> +	size_t size;
>> +
>> +	list_for_each_entry_safe(req, p, main_thread_get(pending_notify_list),
>> +				 pending_list) {
>> +		sd_dprintf("found an pending notify request, op: %s",
>> +			   op_name(req->op));
>> +		/*
>> +		 * notify has been sent but sd_notify_handler is never called,
>> +		 * re-queue it
>> +		 */
>> +		list_del(&req->pending_list);
>> +		queue_cluster_request(req);
>> +	}
>> +
>> +	list_for_each_entry_safe(req, p, main_thread_get(pending_block_list),
>> +				 pending_list) {
>> +		switch (req->status) {
>> +		case REQUEST_INIT:
>> +			/* this request has never been executed, re-queue it */
>> +			sd_dprintf("requeue a block request, op: %s",
>> +				   op_name(req->op));
>> +			list_del(&req->pending_list);
>> +			queue_cluster_request(req);
>> +			break;
>> +		case REQUEST_QUEUED:
>> +			/*
>> +			 * This request is handling by the 'block' thread.
> 
> s/handling/being handled/
> 
> I'd suggest putting it more explicitly:
> ---
> This request is being handled by the 'block' thread and ->unblock() isn't called
> yet. We can't call ->unblock thereafter because other sheep has unblocked
> themselves due to cluster driver session timeout. Mark it as dropped to stop
> cluster_op_done() from calling ->unblock.
> ---
>> +			 * Don't send unblock event, because other sheep has
>> +			 * unblocked it. Drop it when cluster_op_done() is
>> +			 * called.
>> +			 */
>> +			sd_dprintf("drop pending block request, op: %s",
>> +				   op_name(req->op));
>> +			req->status = REQUEST_DROPPED;
>> +			break;
>> +		case REQUEST_DONE:
>> +			/*
>> +			 * Unblock has been sent but sd_notify_handler is never
>> +			 * called, call sd_notify_handler to finish it.
>> +			 */
> 
> Would better rephrase as:
> 
> ---
> ->unblock() was called and succeeded but after that this node session-timeouted
> and sd_notify_handler wasn't called from unblock event handler in cluster driver.
> We manually call sd_notify_handler to finish the request.
> ----

Both look good.
I will update them in next version.

Thanks,
Kyle