[Sheepdog] [PATCH] sheep: don't exit when sheep calls leave_cluster()

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Thu Nov 24 06:35:03 CET 2011


At Thu, 24 Nov 2011 11:55:01 +0800,
Liu Yuan wrote:
> 
> From: Liu Yuan <tailai.ly at taobao.com>
> 
> When some unrecoverable error happens, sheep daemon will leave the cluster but stay
> as a gate to redirect requests.
> 
> For example, in the following case sheep hits an EIO:
> ...
> Nov 24 10:36:15 do_io_request(785) failed: 2, 2, 7c2b2500000000 , 1, 3
> Nov 24 10:36:15 io_op_done(147) leaving sheepdog cluster
> Nov 24 10:36:15 sd_leave_handler(1291) network partition bug: this sheep should have exited
> Nov 24 10:36:15 log_sigsegv(358) logger pid 8255 exiting abnormally
> ...
> 
> This has nothing to do with network partition stuff.
> 
> Signed-off-by: Liu Yuan <tailai.ly at taobao.com>
> ---
>  sheep/group.c |    3 ---
>  1 files changed, 0 insertions(+), 3 deletions(-)
> 
> diff --git a/sheep/group.c b/sheep/group.c
> index f126de5..31d1f76 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -1287,9 +1287,6 @@ static void sd_leave_handler(struct sheepdog_node_list_entry *left,
>  	struct work_leave *w = NULL;
>  	int i, size;
>  
> -	if (node_cmp(left, &sys->this_node) == 0)
> -		panic("network partition bug: this sheep should have exited\n");
> -
>  	dprintf("leave %s\n", node_to_str(left));
>  	for (i = 0; i < nr_members; i++)
>  		dprintf("[%x] %s\n", i, node_to_str(members + i));

It would be better to stop calling the join/leave handlers once the
node has left the cluster.  That is how Sheepdog behaved before the
cluster driver was introduced.
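
A minimal sketch of that idea, only to illustrate the shape of the
check.  All names here (left_cluster, mark_left, dispatch_leave,
struct node) are hypothetical and are not taken from sheep/group.c;
the real fix would probably belong in the cluster driver's event
dispatch path rather than in the handler itself.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a cluster member; not the Sheepdog type. */
struct node {
	int id;
};

static bool left_cluster;	/* set once this node has left the cluster */
static int handled_events;	/* number of leave events actually handled */

/* Record that this node has left; later events are dropped. */
static void mark_left(void)
{
	left_cluster = true;
}

/*
 * Dispatch a leave event.  Returns true if the handler ran, false if
 * the event was ignored because this node is no longer a member.
 */
static bool dispatch_leave(const struct node *left)
{
	if (left_cluster)
		return false;

	printf("leave node %d\n", left->id);
	handled_events++;
	return true;
}
```

With a guard like this, a node that has called mark_left() never
reaches the leave handler for its own departure, so the panic() in
sd_leave_handler() would not fire in the EIO case and could stay in
place for the real network partition case.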

Thanks,

Kazutaka


