[sheepdog] [RFC] create onode before uploading object completed

Liu Yuan namei.unix at gmail.com
Wed Jan 8 03:55:21 CET 2014


On Wed, Jan 08, 2014 at 11:12:41AM +0900, MORITA Kazutaka wrote:
> At Wed, 8 Jan 2014 09:34:00 +0800,
> Robin Dong wrote:
> > 
> > > At Mon, 6 Jan 2014 17:16:22 +0800,
> > > Robin Dong wrote:
> > > >
> > > > Hi All,
> > > >
> > > > At present, the implementation of the Swift interface for creating an
> > > > object in sheepdog is:
> > > >
> > > > 1. lock container
> > > > 2. check whether an onode with the same object name exists.
> > > > 3. unlock container
> > > > 4. upload object
> > > > 5. create onode
> > > >
> > > > This sequence has a problem: if two clients upload the same object
> > > > concurrently, it will create two objects with the same name in the
> > > > container. To avoid duplicate names, we must put the "create onode"
> > > > operation inside the container-locked region.
> > > >
> > > > Therefore we need to change the process of creating an object to:
> > > >
> > > > 1. lock container
> > > > 2. check whether the onode exists.
> > > > 3. allocate data space for the object, create the onode, then write it down
> > > > 4. unlock container
> > > > 5. upload object
> > > >
> > > > This routine will avoid uploading duplicate objects.
> > > >
> > > > There is one exception in the new routine: if the client halts the
> > > > upload partway, we will be left with an "upload incomplete" onode.
> > > > I think this problem is easy to solve: we can add a field to the onode
> > > > to identify its status. A new onode will be set to "INIT", and after
> > > > the upload completes, the onode will be set to "COMPLETED".
> > >
> > > Then, the procedure will be as follows?
> > >
> > >   1. lock container
> > >   2. check whether the onode exists.
> > >   3. allocate data space for the object, create the onode, then write it down
> > >   4. mark the onode as "INIT"
> > >   5. unlock container
> > >   6. upload object
> > >   7. mark the onode as "COMPLETED"
> > >
> > > I'm not against this suggestion, but I'm wondering whether we can get
> > > enough performance with this approach.  IIUC, this introduces
> > > additional write requests to the created onode at 7.
> > >
> > 
> > Hi  MORITA,
> > 
> > We would write back only the status of the onode (perhaps a "uint8_t")
> > at step 7, so performance should not be hurt too much.
> > 
> > 
> > > I've been evaluating Swift these days and noticed that Swift can
> > > process thousands of PUT requests per second with only 3 nodes and 100
> > > disks.  Can Sheepdog achieve similar or better performance with the
> > > suggestion?
> > >
> > 
> > At present, the bottleneck of Swift on sheepdog is the distributed lock on
> > each container. Therefore, if we send PUT requests to one hundred different
> > containers, I think sheepdog could achieve similar performance with the
> > suggestion.
> 
> Okay. I think this should be documented somewhere.  I know some
> benchmark tools for Swift which try to create many objects on one
> container.  Note that Swift shows high performance even if all the
> requests are against one container.

I guess you can test it with the local driver + sheepdog http. That should
give the current maximum throughput, and the lock overhead would be small
enough to evaluate the performance of the http code.

Thanks
Yuan
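
For readers following the thread, the seven-step procedure quoted above can
be sketched roughly as follows. This is only an illustration in Python, not
sheepdog's actual C code: the Container class, the put_object method, the
dict-based onode, and the use of threading.Lock in place of the distributed
container lock are all assumptions made for the example.

```python
import threading
from enum import Enum


class OnodeStatus(Enum):
    INIT = 0        # onode created, upload still in progress
    COMPLETED = 1   # upload finished


class Container:
    """Hypothetical in-memory stand-in for a container (not sheepdog code)."""

    def __init__(self):
        # A local lock stands in for the distributed container lock.
        self.lock = threading.Lock()
        self.onodes = {}  # object name -> onode (a plain dict here)

    def put_object(self, name, data):
        # Steps 1-5: check for the name and create the onode under the lock,
        # so two concurrent uploads cannot both create the same name.
        with self.lock:
            if name in self.onodes:
                return False  # duplicate name, reject this upload
            onode = {"status": OnodeStatus.INIT, "data": None}
            self.onodes[name] = onode
        # Step 6: upload the object data outside the lock.
        onode["data"] = bytes(data)
        # Step 7: one small extra write that flips the status.
        onode["status"] = OnodeStatus.COMPLETED
        return True


if __name__ == "__main__":
    c = Container()
    print(c.put_object("photo.jpg", b"first"))   # accepted
    print(c.put_object("photo.jpg", b"second"))  # rejected as a duplicate
```

In this sketch an upload that stops between steps 5 and 7 leaves an onode in
the INIT state, which is how an interrupted upload could later be recognised.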


