[sheepdog] [PATCH 2/2] dog: check nr_copies before format cluster

Liu Yuan namei.unix at gmail.com
Wed Dec 4 10:39:29 CET 2013


On Wed, Dec 04, 2013 at 05:21:26PM +0800, Robin Dong wrote:
> From: Robin Dong <sanbai at taobao.com>
> 
> If we use erasure-code for "4:2" and there are only 4 nodes in the cluster, it
> will format the cluster successfully but crash when writing data into a vdi.
> 
> Signed-off-by: Robin Dong <sanbai at taobao.com>
> ---
>  dog/cluster.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/dog/cluster.c b/dog/cluster.c
> index 05e98d2..da6091f 100644
> --- a/dog/cluster.c
> +++ b/dog/cluster.c
> @@ -81,6 +81,12 @@ static int cluster_format(int argc, char **argv)
>  	char store_name[STORE_LEN];
>  	static DECLARE_BITMAP(vdi_inuse, SD_NR_VDIS);
>  
> +	if (cluster_cmd_data.copies > sd_nodes_nr) {
> +		printf("Number of copies (%d) can't be larger than number of "
> +		       "nodes (%d)\n", cluster_cmd_data.copies, sd_nodes_nr);
> +		return EXIT_FAILURE;
> +	}

There is one use case that can be demonstrated by the following script:

# with 2 nodes initially
$ dog cluster format -c 3
...operating...
$ sheep /node3 # add another node later, once more nodes are available

I think this is a valid case: at the start people might be short of resources,
so they serve requests with 2 nodes and add more nodes later.

So I'd suggest adding a confirm() for this case instead of returning an error.
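
To make the suggestion concrete, here is a rough standalone sketch of the
"ask instead of fail" behaviour. It is not the actual dog code: the names
needs_confirmation(), answer_is_yes() and copies_check_ok() are made up for
illustration, and the prompt wording is only an assumption. In dog/cluster.c
the existing confirm() helper would presumably take the place of the
fgets() part.

```c
#include <stdbool.h>
#include <stdio.h>
#include <strings.h>

/* Hypothetical: true when the requested copies exceed the current nodes. */
bool needs_confirmation(int copies, int nodes)
{
	return copies > nodes;
}

/* Hypothetical: true when the typed answer starts with "yes" (any case). */
bool answer_is_yes(const char *answer)
{
	return answer != NULL && strncasecmp(answer, "yes", 3) == 0;
}

/*
 * Hypothetical check: returns true when the format should go ahead.
 * The answer is read from 'in' (stdin in a real tool); 'in' is not
 * touched when there are enough nodes.
 */
bool copies_check_ok(int copies, int nodes, FILE *in)
{
	char input[8] = "";

	if (!needs_confirmation(copies, nodes))
		return true;

	printf("Number of copies (%d) is larger than number of nodes (%d).\n"
	       "Are you sure you want to continue? [yes/no]: ",
	       copies, nodes);
	fflush(stdout);

	/* EOF or a read error counts as "no". */
	if (fgets(input, sizeof(input), in) == NULL)
		return false;

	return answer_is_yes(input);
}
```

With something like this, "dog cluster format -c 3" on 2 nodes would still
warn the user, but it would proceed once they answer "yes", so the "add
nodes later" case keeps working.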

Thanks
Yuan




