[sheepdog] [PATCH v4] sheep/md: change process of for_each_object_in_wd() to multi-threads

Liu Yuan namei.unix at gmail.com
Tue Mar 18 08:06:55 CET 2014


On Tue, Mar 18, 2014 at 12:58:51AM +0900, MORITA Kazutaka wrote:
> At Wed, 26 Feb 2014 14:43:33 +0800,
> Robin Dong wrote:
> > 
> > In our test environment, we uploaded 6TB of data and killed one node. The sheep
> > daemons on each server then try to rename files from 'data' to '.stale', but
> > the rename operations take almost half an hour. During this half an hour, the
> > whole cluster can be neither read nor written.
> > 
> > To speed up the renames, we make for_each_object_in_wd() multi-threaded,
> > with one thread per disk.
> > 
> > Signed-off-by: Robin Dong <sanbai at taobao.com>
> > ---
> > v3-->v4:
> >  1. add free() for thread_args and thread_array
> > 
> > v2-->v3:
> >  1. optimize for_each_object_in_wd() instead of for_each_object_in_stale()
> > 
> > v1-->v2:
> >  1. support unlimited number of disk
> >  2. panic() if create thread fail
> >  3. change sd_warn to sd_err
> > 
> >  sheep/md.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 68 insertions(+), 3 deletions(-)
> > 
> > diff --git a/sheep/md.c b/sheep/md.c
> > index e7e8ec2..03e3d37 100644
> > --- a/sheep/md.c
> > +++ b/sheep/md.c
> > @@ -354,20 +354,85 @@ const char *md_get_object_path(uint64_t oid)
> >  	return p;
> >  }
> >  
> > +struct process_path_arg {
> > +	const char *path;
> > +	int (*func)(uint64_t oid, const char *path, uint32_t epoch, void *arg);
> > +	bool cleanup;
> > +	void *opaque;
> > +	int result;
> > +};
> > +
> > +static void *thread_process_path(void *arg)
> > +{
> > +	int ret = SD_RES_SUCCESS;
> > +	struct process_path_arg *parg = (struct process_path_arg *)arg;
> > +
> > +	ret = for_each_object_in_path(parg->path, parg->func, parg->cleanup,
> > +				      parg->opaque);
> > +	if (ret != SD_RES_SUCCESS)
> > +		parg->result = ret;
> > +
> > +	return arg;
> > +}
> 
> Although this patch is already in the master branch, I think this code
> has a race condition problem.  for_each_object_in_path() calls
> get_vnode_info() to check whether each object is stale or not.
> However, get_vnode_info() is not thread-safe and we cannot call it
> outside of the main thread.
> 

Probably no race. This can be seen as being executed in the main thread because
 - the main thread starts the workers and then blocks until
   thread_process_path() has finished
 - only when all thread_process_path() calls have finished does the main
   thread continue and return control.

It looks a bit tricky, but actually this means we only multi-thread the
execution of our main thread, no?
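
To make the shape of that argument concrete, here is a minimal sketch of the
fork-and-join pattern, assuming plain pthreads. It is not the sheepdog code:
disk_job, run_on_all_disks() and count_visit() are hypothetical names made up
for illustration. The main thread starts one worker per disk and does nothing
else until every worker has been joined.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical per-disk work item (illustration only, not the md.c struct). */
struct disk_job {
	const char *path;                              /* directory to scan */
	int (*func)(const char *path, void *opaque);   /* per-disk callback */
	void *opaque;
	int result;   /* written by the worker, read by main after the join */
};

static void *disk_worker(void *arg)
{
	struct disk_job *job = arg;

	job->result = job->func(job->path, job->opaque);
	return NULL;
}

/*
 * Start one thread per disk, then block until all of them have finished.
 * Returns the first non-zero result, or 0 if every callback succeeded.
 */
static int run_on_all_disks(const char **paths, int nr,
			    int (*func)(const char *, void *), void *opaque)
{
	struct disk_job *jobs;
	pthread_t *threads;
	int i, ret = 0;

	assert(nr > 0);
	jobs = calloc(nr, sizeof(*jobs));
	threads = calloc(nr, sizeof(*threads));
	if (!jobs || !threads)
		abort();

	for (i = 0; i < nr; i++) {
		jobs[i] = (struct disk_job){ paths[i], func, opaque, 0 };
		if (pthread_create(&threads[i], NULL, disk_worker, &jobs[i]))
			abort();
	}

	/* The caller makes no progress of its own until every join returns. */
	for (i = 0; i < nr; i++) {
		pthread_join(threads[i], NULL);
		if (ret == 0 && jobs[i].result != 0)
			ret = jobs[i].result;
	}

	free(threads);
	free(jobs);
	return ret;
}

/*
 * Example callback: the workers run concurrently with each other, so the
 * shared counter is updated atomically.
 */
static int count_visit(const char *path, void *opaque)
{
	(void)path;
	atomic_fetch_add((atomic_int *)opaque, 1);
	return 0;
}
```

Note that the join only orders the workers against the caller. The workers
still run concurrently with each other, so the argument holds only as long as
whatever the callback touches is safe to use from several workers at once.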

Thanks
Yuan


