[sheepdog] [PATCH] sheep:Fix when the recovery is complete. ./stale directory still have objects.

Liu Yuan namei.unix at gmail.com
Mon Mar 18 10:47:12 CET 2013


On 03/15/2013 05:09 PM, wangfan wrote:
> If sheep process have differert disk size,after recovery. ./stale directory
> still have objects.because in the cluster_recovery_completion.the memcmp function
> parameter vnode_info->nodes and recovereds will have different nr_vnodes
> number after recovery completion.so i just compare node_id to fix this problem.
> 

The subject title is too long, I'd suggest

sheep: fix automatic stale object purging

> Signed-off-by: wangfan <wangfan1985 at gmail.com>
> ---
>  sheep/ops.c   |    8 +++++---
>  tests/054     |   35 +++++++++++++++++++++++++++++++++++
>  tests/054.out |   22 ++++++++++++++++++++++
>  tests/group   |    1 +
>  4 files changed, 63 insertions(+), 3 deletions(-)
>  create mode 100755 tests/054
>  create mode 100644 tests/054.out
> 
> diff --git a/sheep/ops.c b/sheep/ops.c
> index b9634ae..706346a 100644
> --- a/sheep/ops.c
> +++ b/sheep/ops.c
> @@ -626,9 +626,11 @@ static int cluster_recovery_completion(const struct sd_req *req,
>  
>  	vnode_info = get_vnode_info();
>  
> -	if (vnode_info->nr_nodes == nr_recovereds &&
> -	    memcmp(vnode_info->nodes, recovereds,
> -		   sizeof(*recovereds) * nr_recovereds) == 0) {
> +	if (vnode_info->nr_nodes == nr_recovereds){
> +		for (i = 0; i <= nr_recovereds; ++i){
> +			if (node_id_cmp(&vnode_info->nodes[i].nid , &recovereds[i].nid))
> +				break;

break only break out 'for' statements. I'd suggest use 'goto' jump out
of the whole if statement.

> +		}
>  		sd_dprintf("all nodes are recovered at epoch %d", epoch);
>  		if (sd_store->cleanup)
>  			sd_store->cleanup();
> diff --git a/tests/054 b/tests/054
> new file mode 100755
> index 0000000..c7336dc
> --- /dev/null
> +++ b/tests/054
> @@ -0,0 +1,35 @@
> +#!/bin/bash
> +
> +# Test after recovery is completed.useless files have been deleted.
> +
> +seq=`basename $0`
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1        # failure is the default!
> +
> +# get standard environment, filters and checks
> +. ./common.rc
> +. ./common.filter
> +
> +_cleanup
> +#start three in different size
> +for i in 0 1 2; do
> +	 $SHEEP $STORE/$i -d -z $i -p 700$i -c local store/$i -s $(($i+1))000 -y 127.0.0.1
> +done

You should use _start_sheep $i "-s $(($i+1))000"

> +_wait_for_sheep 3
> +$COLLIE cluster format
> +sleep 1
> +$COLLIE vdi create test 100M -P
> +$COLLIE cluster info | _filter_cluster_info
> +
> +#start recovery
> +$SHEEP $STORE/3 -d -z 3 -p 7003 -c local store/3 -s 4000 -y 127.0.0.1

Same here.

> +sleep 1
> +_wait_for_sheep 4
> +$COLLIE cluster info | _filter_cluster_info
> +
> +sleep 10

use _wait_for_sheep_recovery 0

> +#test no object in .stale
> +ls $STORE/*/obj/.stale |sort

need space before 'sort'.

And last, I'd suggest separate this patch into 2 patches, one to fix
sheep, the other for tests/054.

Thanks,
Yuan



More information about the sheepdog mailing list