[sheepdog] [PATCH 1/2] zookeeper: handling lost block/notify events during session timeout
Liu Yuan
namei.unix at gmail.com
Tue Jul 2 08:51:02 CEST 2013
On Mon, Jul 01, 2013 at 11:31:56PM -0700, Kai Zhang wrote:
> If the zookeeper session has timed out, zk_block() and zk_notify() just
> do nothing, and the block/notify event is missed.
> However, these events have already been added to pending_block_list and
> pending_notify_list. If a new cluster operation then arrives, this leads
> to an undefined operation.
>
> This patch fixes the problem by re-issuing the unhandled block/notify
> events once the connection to zookeeper is re-established.
>
> Signed-off-by: Kai Zhang <kyle at zelin.io>
> ---
> sheep/group.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/sheep/group.c b/sheep/group.c
> index 2d4a25c..de315c3 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -1101,13 +1101,32 @@ static int send_join_request(struct sd_node *ent)
>
>  int sd_reconnect_handler(void)
>  {
> +	struct request *req;
> +
>  	sys->status = SD_STATUS_WAIT_FOR_JOIN;
>  	sys->join_finished = false;
> +
>  	if (sys->cdrv->init(sys->cdrv_option) != 0)
>  		return -1;
>  	if (send_join_request(&sys->this_node) != 0)
>  		return -1;
>
I'd suggest adding a helper function, requeue_cluster_requests(), to make
this more self-explanatory, and adding a comment that explains why it is
called here.
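
A rough sketch of what such a helper could look like follows. It is only
an illustration under stated assumptions: the struct definitions are
simplified stand-ins rather than the real sheep types, the block() and
notify() callback signatures are guessed, and the stub driver exists only
so the sketch compiles on its own.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real sheepdog definitions. */
struct request {
	int id;
	struct request *next;	/* next pending request, or NULL */
};

struct cluster_driver {
	int (*block)(void);				/* assumed signature */
	int (*notify)(const struct request *req);	/* assumed signature */
};

struct system_info {
	struct cluster_driver *cdrv;
	struct request *pending_block_list;
	struct request *pending_notify_list;
};

/*
 * Block/notify events issued while the zookeeper session was timed out
 * were silently dropped by the driver, but the requests are still on the
 * pending lists.  Re-issue them once the connection is re-established so
 * that the next cluster operation does not find stale entries.
 *
 * Returns 0 on success, -1 on the first driver failure.
 */
static int requeue_cluster_requests(struct system_info *sys)
{
	const struct request *req;

	for (req = sys->pending_block_list; req != NULL; req = req->next)
		if (sys->cdrv->block() != 0)
			return -1;

	for (req = sys->pending_notify_list; req != NULL; req = req->next)
		if (sys->cdrv->notify(req) != 0)
			return -1;

	return 0;
}

/* Stub driver for illustration: it only counts how often it is called. */
static int block_calls, notify_calls;

static int stub_block(void)
{
	block_calls++;
	return 0;
}

static int stub_notify(const struct request *req)
{
	(void)req;
	notify_calls++;
	return 0;
}
```

With something like this, sd_reconnect_handler() would need only a single
call to the helper, and the comment above it records why the call is
there.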
Thanks,
Yuan