[sheepdog] [RFC] create onode before uploading object completed

MORITA Kazutaka morita.kazutaka at gmail.com
Wed Jan 8 03:12:41 CET 2014


At Wed, 8 Jan 2014 09:34:00 +0800,
Robin Dong wrote:
> 
> > At Mon, 6 Jan 2014 17:16:22 +0800,
> > Robin Dong wrote:
> > >
> > > Hi All,
> > >
> > > At present, the implementation of the Swift interface for creating an
> > > object in sheepdog is:
> > >
> > > 1. lock container
> > > 2. check whether an onode with the same object name exists
> > > 3. unlock container
> > > 4. upload object
> > > 5. create onode
> > >
> > > This sequence has a problem: if two clients upload the same object
> > > concurrently, it will create two objects with the same name in the
> > > container. To avoid duplicate names, we must put the "create onode"
> > > operation inside the container-lock region.
> > >
> > > Therefore we need to change the process of creating an object to:
> > >
> > > 1. lock container
> > > 2. check whether the onode exists
> > > 3. allocate data space for the object, create the onode, then write it down
> > > 4. unlock container
> > > 5. upload object
> > >
> > > This routine avoids uploading duplicate objects.
> > >
> > > There is one exception in the new routine: if the client halts the
> > > upload, we will be left with an "upload not completed" onode.
> > > I think this problem is easy to solve: we can add code to the onode to
> > > identify its status.
> > > A new onode will be set to "INIT", and after the upload completes, the
> > > onode will be set to "COMPLETED".
> >
> > Then will the procedure be as follows?
> >
> >   1. lock container
> >   2. check whether the onode exists
> >   3. allocate data space for the object, create the onode, then write it down
> >   4. mark the onode as "INIT"
> >   5. unlock container
> >   6. upload object
> >   7. mark the onode as "COMPLETED"
> >
> > I'm not against this suggestion, but I'm wondering whether we can get
> > enough performance with this approach.  IIUC, this introduces
> > additional write requests to the created onode at step 7.
> >
> 
> Hi MORITA,
> 
> We would only need to write back the status of the onode (perhaps a
> "uint8_t") at step 7, so performance should not be hurt too much.
> 
> 
> > I've been evaluating Swift these days and noticed that Swift can
> > process thousands of PUT requests per second with only 3 nodes and 100
> > disks.  Can Sheepdog achieve similar or better performance with the
> > suggestion?
> >
> 
> At present, the bottleneck of Swift on sheepdog is the distributed lock on
> each container.
> Therefore, if we send PUT requests to one hundred different containers, I
> think sheepdog could achieve similar performance with the suggestion.

Okay. I think this should be documented somewhere.  I know of some
benchmark tools for Swift which try to create many objects in one
container.  Note that Swift shows high performance even if all the
requests are against one container.
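
To make the quoted seven-step procedure concrete, here is a minimal
in-memory sketch. It is written in Python rather than sheepdog's C, and
everything in it is an assumption for illustration: the Container class,
the put_object function, and the INIT/COMPLETED values are hypothetical
names, and a threading.Lock merely stands in for the distributed
container lock. It is not how sheepdog implements onodes.

```python
import threading

INIT, COMPLETED = 0, 1  # hypothetical one-byte status values


class Container:
    """In-memory stand-in for a container (not sheepdog's real structure)."""

    def __init__(self):
        self.lock = threading.Lock()  # stands in for the distributed container lock
        self.onodes = {}              # object name -> {"status": ..., "data": ...}


def put_object(container, name, data):
    """Return True if this caller created the object, False if the name was taken."""
    with container.lock:                        # 1. lock container
        if name in container.onodes:            # 2. check whether the onode exists
            return False
        onode = {"status": INIT, "data": None}  # 3-4. create the onode, mark it INIT
        container.onodes[name] = onode
    # 5. the container lock is released when the "with" block ends
    onode["data"] = bytes(data)                 # 6. upload the object outside the lock
    onode["status"] = COMPLETED                 # 7. the extra small write discussed above
    return True
```

Because each Container in this sketch has its own lock, PUTs to different
containers never wait on each other, while concurrent PUTs of the same
name to one container are serialized and only the first one succeeds. An
onode still marked INIT after a client disappears is the "upload not
completed" case from the quoted mail.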

I think it would be worth considering removing the container lock and
supporting an eventual consistency model, though that is future work.
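
As a rough illustration of what a lock-free, eventually consistent
listing could look like, the sketch below lets each replica accept
writes without any container lock and reconciles replicas afterwards by
keeping the newest entry per name (last writer wins). This is only one
possible approach, not a design for sheepdog; the function names and the
logical clock are hypothetical, and a real system would need a proper
timestamp source.

```python
import itertools

_clock = itertools.count(1)  # hypothetical logical clock, for illustration only


def put_without_lock(replica, name, data):
    """Record a write on one replica (a plain dict) with no container lock."""
    stamp = next(_clock)
    replica[name] = (stamp, data)
    return stamp


def merge(a, b):
    """Reconcile two replicas: for each name keep the entry with the newest stamp."""
    merged = dict(a)
    for name, entry in b.items():
        if name not in merged or entry[0] > merged[name][0]:
            merged[name] = entry
    return merged
```

Merging in either order gives the same result, so replicas converge once
they have exchanged their entries; the price is that a reader may
briefly see an older version of an object.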

Thanks,

Kazutaka


