[sheepdog] [PATCH 1/2] zookeeper: handling lost block/notify events during session timeout

Liu Yuan namei.unix at gmail.com
Tue Jul 2 08:51:02 CEST 2013


On Mon, Jul 01, 2013 at 11:31:56PM -0700, Kai Zhang wrote:
> If the zookeeper session has timed out, zk_block() and zk_notify() just
> do nothing, and the block/notify event is lost.
> However, these events have already been added to pending_block_list and
> pending_notify_list. If a new cluster operation then arrives, this leads
> to an undefined operation.
> 
> This patch fixes the problem by re-issuing the unhandled block/notify
> events once the connection to zookeeper is re-established.
> 
> Signed-off-by: Kai Zhang <kyle at zelin.io>
> ---
>  sheep/group.c |   19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/sheep/group.c b/sheep/group.c
> index 2d4a25c..de315c3 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -1101,13 +1101,32 @@ static int send_join_request(struct sd_node *ent)
>  
>  int sd_reconnect_handler(void)
>  {
> +	struct request *req;
> +
>  	sys->status = SD_STATUS_WAIT_FOR_JOIN;
>  	sys->join_finished = false;
> +
>  	if (sys->cdrv->init(sys->cdrv_option) != 0)
>  		return -1;
>  	if (send_join_request(&sys->this_node) != 0)
>  		return -1;
>  

I'd suggest adding a helper function, requeue_cluster_requests(), to make
this more self-explanatory, along with a comment explaining why it is
called here.
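
A rough sketch of what such a helper might look like. Everything except the
name requeue_cluster_requests() is an assumption made for illustration: the
struct layouts, the driver callback signatures and the stub driver are
simplified stand-ins, not the real sheepdog definitions, and the rest of the
patch is not shown above, so this is not necessarily what it does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real sheepdog types. */
struct request {
	int id;
	struct request *next;	/* singly linked pending list */
};

struct cluster_driver {
	int (*block)(void);
	int (*notify)(const struct request *req);
};

/* Stub driver used only for demonstration: it just counts calls. */
static int nr_block_sent;
static int nr_notify_sent;

static int stub_block(void)
{
	nr_block_sent++;
	return 0;
}

static int stub_notify(const struct request *req)
{
	(void)req;
	nr_notify_sent++;
	return 0;
}

/*
 * Block/notify events issued while the session was timed out were dropped
 * by the driver, but the requests are still on the pending lists. Send
 * those events again after reconnecting so that the pending lists and the
 * cluster's view of them agree before any new cluster operation arrives.
 *
 * The requests stay on their lists; only the driver calls are repeated.
 * Returns 0 on success, -1 as soon as one driver call fails.
 */
static int requeue_cluster_requests(const struct cluster_driver *drv,
				    const struct request *pending_block,
				    const struct request *pending_notify)
{
	const struct request *req;

	assert(drv != NULL);

	for (req = pending_block; req != NULL; req = req->next)
		if (drv->block() != 0)
			return -1;

	for (req = pending_notify; req != NULL; req = req->next)
		if (drv->notify(req) != 0)
			return -1;

	return 0;
}
```

In sd_reconnect_handler() the call would presumably go after
send_join_request() succeeds, with the comment block above moved to the
call site or kept on the helper itself.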

Thanks,
Yuan


