[Sheepdog] [PATCH] fix a bug in recovery which makes sheep get an incomplete epoch node list

Liu Yuan namei.unix at gmail.com
Thu Apr 12 03:47:05 CEST 2012


On 04/11/2012 02:50 PM, Li Wenpeng wrote:

> From: levin li <xingke.lwp at taobao.com>
> 
> In do_recover_object(), when recovery fails at some epoch, we need to go
> back to the previous epoch. However, get_vnodes_from_epoch() returned an
> incomplete node list for the specified epoch, so the target node from which
> we recover the object was sometimes wrong, and in that case recovery always
> failed. The length used to read the epoch file was shorter than expected
> (an element count was passed instead of a byte count); this patch fixes it.
> 
> Signed-off-by: levin li <xingke.lwp at taobao.com>
> ---
>  sheep/store.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/sheep/store.c b/sheep/store.c
> index 5d82e5a..739862c 100644
> --- a/sheep/store.c
> +++ b/sheep/store.c
> @@ -916,7 +916,7 @@ int epoch_log_read_remote(uint32_t epoch, char *buf, int len)
>  	char host[128];
>  	struct sd_node nodes[SD_MAX_NODES];
>  
> -	nr = epoch_log_read(le, (char *)nodes, ARRAY_SIZE(nodes));
> +	nr = epoch_log_read(le, (char *)nodes, sizeof(nodes));
>  	nr /= sizeof(nodes[0]);
>  	for (i = 0; i < nr; i++) {
>  		if (is_myself(nodes[i].addr, nodes[i].port))
> @@ -1286,9 +1286,9 @@ static void *get_vnodes_from_epoch(int epoch, int *nr, int *copies)
>  	struct sd_node nodes[SD_MAX_NODES];
>  	void *buf = xmalloc(len);
>  
> -	nodes_nr = epoch_log_read_nr(epoch, (void *)nodes, ARRAY_SIZE(nodes));
> +	nodes_nr = epoch_log_read_nr(epoch, (void *)nodes, sizeof(nodes));
>  	if (nodes_nr < 0) {
> -		nodes_nr = epoch_log_read_remote(epoch, (void *)nodes, ARRAY_SIZE(nodes));
> +		nodes_nr = epoch_log_read_remote(epoch, (void *)nodes, sizeof(nodes));
>  		if (nodes_nr == 0) {
>  			free(buf);
>  			return NULL;
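
The effect of passing ARRAY_SIZE(nodes) where a byte length is expected can be shown with a small standalone sketch. Everything here is a hypothetical stand-in and not the real sheepdog code: struct node, MAX_NODES, epoch_log and log_read() are invented for illustration, assuming only that the reader treats its length argument as a byte count, as the patch implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#define MAX_NODES 8	/* hypothetical stand-in for SD_MAX_NODES */

/* Hypothetical, simplified node record; not the real struct sd_node. */
struct node {
	uint8_t addr[16];
	uint16_t port;
};

/* Hypothetical in-memory stand-in for an on-disk epoch log. */
static struct node epoch_log[MAX_NODES];

/*
 * Mimics a reader whose 'len' is a byte count: copies at most 'len'
 * bytes of the first 'stored' entries into 'buf' and returns the
 * number of bytes copied.
 */
static int log_read(size_t stored, char *buf, size_t len)
{
	size_t avail = stored * sizeof(struct node);
	size_t n = avail < len ? avail : len;

	assert(stored <= MAX_NODES);
	memcpy(buf, epoch_log, n);
	return (int)n;
}

/* Buggy call: passes the element count where a byte count is expected. */
static int read_nodes_buggy(size_t stored)
{
	struct node nodes[MAX_NODES];
	int nr = log_read(stored, (char *)nodes, ARRAY_SIZE(nodes));

	return nr / (int)sizeof(nodes[0]);
}

/* Fixed call: passes the size of the whole array in bytes. */
static int read_nodes_fixed(size_t stored)
{
	struct node nodes[MAX_NODES];
	int nr = log_read(stored, (char *)nodes, sizeof(nodes));

	return nr / (int)sizeof(nodes[0]);
}
```

With four entries in the hypothetical log, read_nodes_buggy(4) reports 0 nodes because only MAX_NODES bytes are read, which is less than one record, while read_nodes_fixed(4) reports all 4.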


Applied, thanks.

Yuan


