[sheepdog] [PATCH v4] sheep/md: change process of for_each_object_in_wd() to multi-threads
MORITA Kazutaka
morita.kazutaka at gmail.com
Fri Mar 21 08:52:26 CET 2014
At Tue, 18 Mar 2014 15:06:55 +0800,
Liu Yuan wrote:
>
> On Tue, Mar 18, 2014 at 12:58:51AM +0900, MORITA Kazutaka wrote:
> > At Wed, 26 Feb 2014 14:43:33 +0800,
> > Robin Dong wrote:
> > >
> > > In our test environment, we uploaded 6TB of data and killed one node. The sheep
> > > daemons on each server then try to rename files from 'data' to '.stale', but
> > > the rename operations take almost half an hour. During this half hour, the
> > > whole cluster can neither be read nor written.
> > >
> > > To speed up the renames, we change for_each_object_in_wd() to run
> > > multi-threaded, with one thread per disk.
> > >
> > > Signed-off-by: Robin Dong <sanbai at taobao.com>
> > > ---
> > > v3-->v4:
> > > 1. add free() for thread_args and thread_array
> > >
> > > v2-->v3:
> > > 1. optimize for_each_object_in_wd() instead of for_each_object_in_stale()
> > >
> > > v1-->v2:
> > > 1. support unlimited number of disk
> > > 2. panic() if create thread fail
> > > 3. change sd_warn to sd_err
> > >
> > > sheep/md.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
> > > 1 file changed, 68 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/sheep/md.c b/sheep/md.c
> > > index e7e8ec2..03e3d37 100644
> > > --- a/sheep/md.c
> > > +++ b/sheep/md.c
> > > @@ -354,20 +354,85 @@ const char *md_get_object_path(uint64_t oid)
> > >  	return p;
> > >  }
> > >
> > > +struct process_path_arg {
> > > +	const char *path;
> > > +	int (*func)(uint64_t oid, const char *path, uint32_t epoch, void *arg);
> > > +	bool cleanup;
> > > +	void *opaque;
> > > +	int result;
> > > +};
> > > +
> > > +static void *thread_process_path(void *arg)
> > > +{
> > > +	int ret = SD_RES_SUCCESS;
> > > +	struct process_path_arg *parg = (struct process_path_arg *)arg;
> > > +
> > > +	ret = for_each_object_in_path(parg->path, parg->func, parg->cleanup,
> > > +				      parg->opaque);
> > > +	if (ret != SD_RES_SUCCESS)
> > > +		parg->result = ret;
> > > +
> > > +	return arg;
> > > +}
> >
> > Although this patch is already in the master branch, I think this code
> > has a race condition problem. for_each_object_in_path() calls
> > get_vnode_info() to check whether each object is stale or not.
> > However, get_vnode_info() is not thread-safe and we cannot call it
> > outside of the main thread.
> >
>
> Probably no race. This can be seen as executing in the main thread because
> - the main thread starts it and blocks while thread_process_path() executes
> - when all thread_process_path() calls finish, the main thread keeps going and
> returns control.
>
> It looks a bit tricky, but actually this means we multi-thread the execution
> of our main thread only, no?

Okay, makes sense to me.  I think the rationale should have been added
into the source code as a comment.  In addition,

 - We should add the main_fn marker to for_each_object_in_wd() because
   this patch is wrong if the function is called in worker threads.

 - get_vnode_info() should be called only once in
   for_each_object_in_wd() (in the main thread) and the result should
   be passed to thread_process_path() as a pthread argument.  This
   avoids lots of calls to get_vnode_info() and is necessary to pass
   our thread checker of the sheepdog tracer.
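
A rough sketch of the second point, not the actual sheepdog code: the
main thread obtains the vnode information once and hands a read-only
pointer to every per-disk thread through its pthread argument, then
blocks in pthread_join() until all of them finish.  The names
vnode_snapshot, path_work, process_path() and process_all_paths() are
hypothetical stand-ins, and the worker body is only a placeholder for
the real per-object loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for sheepdog's vnode info; not the real struct. */
struct vnode_snapshot {
	uint32_t epoch;
};

/* Hypothetical per-disk work item, modeled on process_path_arg. */
struct path_work {
	const char *path;
	const struct vnode_snapshot *vinfo;	/* read-only in workers */
	int result;
};

/* Worker: uses only the snapshot passed in; never fetches it itself. */
static void *process_path(void *arg)
{
	struct path_work *w = arg;

	/* Placeholder for the per-object loop: record the epoch seen. */
	w->result = (int)w->vinfo->epoch;
	return NULL;
}

/*
 * Intended to be called from the main thread only.  The caller fetches
 * the snapshot once; this function starts one thread per path and blocks
 * until every started thread has been joined.
 * Returns 0 on success, -1 on allocation or thread creation failure.
 */
static int process_all_paths(const char **paths, size_t nr,
			     const struct vnode_snapshot *vinfo,
			     int *results)
{
	pthread_t *threads;
	struct path_work *work;
	size_t i, started = 0;
	int ret = 0;

	assert(paths != NULL && vinfo != NULL && results != NULL);
	if (nr == 0)
		return 0;

	threads = calloc(nr, sizeof(*threads));
	work = calloc(nr, sizeof(*work));
	if (!threads || !work) {
		free(threads);
		free(work);
		return -1;
	}

	for (i = 0; i < nr; i++) {
		work[i].path = paths[i];
		work[i].vinfo = vinfo;
		if (pthread_create(&threads[i], NULL, process_path,
				   &work[i]) != 0) {
			ret = -1;
			break;
		}
		started++;
	}

	/* Join only the threads that were actually started. */
	for (i = 0; i < started; i++) {
		pthread_join(threads[i], NULL);
		results[i] = work[i].result;
	}

	free(threads);
	free(work);
	return ret;
}
```

Because the workers only read through the const pointer and the main
thread does nothing else until the joins complete, the snapshot needs
no extra locking in this sketch.
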
Thanks,
Kazutaka