[sheepdog] [PATCH v4] sheep/md: change process of for_each_object_in_wd() to multi-threads

MORITA Kazutaka morita.kazutaka at gmail.com
Mon Mar 17 16:58:51 CET 2014


At Wed, 26 Feb 2014 14:43:33 +0800,
Robin Dong wrote:
> 
> In our test environment, we upload 6TB data and kill one node, then the sheep
> daemons on each server will try to rename files from 'data' to '.stale', but
> the rename operations cost almost half an hour. And in this half an hour, the
> whole cluster can't be read or write.
> 
> To accelerate the speed of renames, we change for_each_object_in_wd() to
> multi-threads, which is one thread for one disk.
> 
> Signed-off-by: Robin Dong <sanbai at taobao.com>
> ---
> v3-->v4:
>  1. add free() for thread_args and thread_array
> 
> v2-->v3:
>  1. optimize for_each_object_in_wd() instead of for_each_object_in_stale()
> 
> v1-->v2:
>  1. support unlimited number of disk
>  2. panic() if create thread fail
>  3. change sd_warn to sd_err
> 
>  sheep/md.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 68 insertions(+), 3 deletions(-)
> 
> diff --git a/sheep/md.c b/sheep/md.c
> index e7e8ec2..03e3d37 100644
> --- a/sheep/md.c
> +++ b/sheep/md.c
> @@ -354,20 +354,85 @@ const char *md_get_object_path(uint64_t oid)
>  	return p;
>  }
>  
> +struct process_path_arg {
> +	const char *path;
> +	int (*func)(uint64_t oid, const char *path, uint32_t epoch, void *arg);
> +	bool cleanup;
> +	void *opaque;
> +	int result;
> +};
> +
> +static void *thread_process_path(void *arg)
> +{
> +	int ret = SD_RES_SUCCESS;
> +	struct process_path_arg *parg = (struct process_path_arg *)arg;
> +
> +	ret = for_each_object_in_path(parg->path, parg->func, parg->cleanup,
> +				      parg->opaque);
> +	if (ret != SD_RES_SUCCESS)
> +		parg->result = ret;
> +
> +	return arg;
> +}

Although this patch is already in the master branch, I think this code
has a race condition.  for_each_object_in_path() calls
get_vnode_info() to check whether each object is stale.
However, get_vnode_info() is not thread-safe, so it must not be called
outside of the main thread, which is exactly what thread_process_path()
ends up doing in each worker.
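
One possible direction, only as a rough sketch and not a tested fix for
sheep: let the main thread obtain the vnode information once and hand
each worker a read-only pointer to it, so that no worker touches the
shared state itself.  Everything below (struct vnode_snapshot,
struct worker_arg, worker(), scan_disks()) is a hypothetical stand-in
for illustration and is not the real sheep API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the vnode info; not the real sheep struct. */
struct vnode_snapshot {
	uint32_t epoch;
	int nr_nodes;
};

struct worker_arg {
	const char *path;			/* one disk per worker */
	const struct vnode_snapshot *snap;	/* read-only, owned by main thread */
	int result;
};

/* Worker: reads only the snapshot it was given, never global state. */
static void *worker(void *arg)
{
	struct worker_arg *w = arg;

	/* Placeholder for the per-object staleness check. */
	w->result = (w->snap->nr_nodes > 0) ? 0 : -1;
	return NULL;
}

/*
 * Called from the main thread, which has already taken the snapshot.
 * Returns 0 if every worker succeeded, -1 otherwise.
 */
static int scan_disks(const char **paths, int nr_paths,
		      const struct vnode_snapshot *snap)
{
	pthread_t *threads;
	struct worker_arg *args;
	int i, started = 0, ret = 0;

	assert(nr_paths > 0);
	threads = calloc(nr_paths, sizeof(*threads));
	args = calloc(nr_paths, sizeof(*args));
	if (!threads || !args) {
		free(threads);
		free(args);
		return -1;
	}

	for (i = 0; i < nr_paths; i++) {
		args[i].path = paths[i];
		args[i].snap = snap;
		args[i].result = 0;	/* initialized before the thread runs */
		if (pthread_create(&threads[i], NULL, worker, &args[i]) != 0) {
			ret = -1;
			break;
		}
		started++;
	}

	/* Join only the threads that were actually created. */
	for (i = 0; i < started; i++) {
		pthread_join(threads[i], NULL);
		if (args[i].result != 0)
			ret = -1;
	}

	free(threads);
	free(args);
	return ret;
}
```

The snapshot must stay valid and unmodified until all workers are
joined; whether that is practical in sheep depends on how the vnode
info is reference-counted, which I have not checked here.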

Thanks,

Kazutaka


