[Sheepdog] [PATCH] sheep: fix a network partition issue

Liu Yuan namei.unix at gmail.com
Mon Oct 24 11:43:31 CEST 2011


On 10/24/2011 05:29 PM, zituan at taobao.com wrote:

> From: Yibin Shen <zituan at taobao.com>
> 
> In some situations, a node that has left can still receive the confchg
> event, possibly because of corosync, and that leads to a network
> partition. This patch fixes it.
> 


So what kind of situation is this, and why can a left node still process
requests when it has already 'left'?

> Signed-off-by: Yibin Shen <zituan at taobao.com>
> ---
>  sheep/group.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/sheep/group.c b/sheep/group.c
> index e22dabc..155247d 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -1467,6 +1467,10 @@ static void sd_leave_handler(struct sheepdog_node_list_entry *left,
>  	struct work_leave *w = NULL;
>  	int i, size;
>  
> +	if (!memcmp(left, &sys->this_node, sizeof(struct sheepdog_node_list_entry))) {
> +		eprintf("BUG: this node can't be on the left list\n");
> +		abort();
> +	}


Use node_cmp() to compare the nodes instead of memcmp().
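
A rough sketch of the idea, not the actual sheepdog code: the struct and
helpers below (example_node, example_node_cmp, example_is_self) are
hypothetical stand-ins, and the assumption is that a node's identity is its
address and port only. Under that assumption, a memcmp() over the whole
struct can report a mismatch just because some other field (or padding)
differs, while a field-wise compare still recognizes the node.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct sheepdog_node_list_entry. */
struct example_node {
	uint8_t  addr[16];
	uint16_t port;
	uint16_t node_idx;	/* assumed bookkeeping field, not part of identity */
};

/* Hypothetical stand-in for node_cmp(): compares identity fields only. */
static int example_node_cmp(const struct example_node *a,
			    const struct example_node *b)
{
	int cmp = memcmp(a->addr, b->addr, sizeof(a->addr));

	if (cmp != 0)
		return cmp;
	if (a->port != b->port)
		return a->port < b->port ? -1 : 1;
	return 0;
}

/* Returns 1 if 'left' identifies the same node as 'self', 0 otherwise. */
static int example_is_self(const struct example_node *left,
			   const struct example_node *self)
{
	return example_node_cmp(left, self) == 0;
}
```

In sd_leave_handler() the check would then presumably become something like
"if (node_cmp(left, &sys->this_node) == 0)", assuming node_cmp() returns 0
for matching nodes.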

Thanks,
Yuan


