[sheepdog] [RFC] create onode before uploading object completed
MORITA Kazutaka
morita.kazutaka at gmail.com
Wed Jan 8 03:12:41 CET 2014
At Wed, 8 Jan 2014 09:34:00 +0800,
Robin Dong wrote:
>
> > At Mon, 6 Jan 2014 17:16:22 +0800,
> > Robin Dong wrote:
> > >
> > > Hi All,
> > >
> > > At present, the implementation of the swift interface for creating an
> > > object in sheepdog is:
> > >
> > > 1. lock container
> > > 2. check whether an onode with the same object name exists
> > > 3. unlock container
> > > 4. upload object
> > > 5. create onode
> > >
> > > This sequence has a problem: if two clients upload the same object
> > > concurrently, it will create two objects with the same name in the
> > > container. To avoid duplicate names, we must put the "create onode"
> > > operation inside the container lock region.
> > >
> > > Therefore we need to change the process of creating an object to:
> > >
> > > 1. lock container
> > > 2. check whether the onode exists
> > > 3. allocate data space for the object, create the onode, then write it down
> > > 4. unlock container
> > > 5. upload object
> > >
> > > This routine avoids uploading duplicate objects.
> > >
> > > There is one exception in the new routine: if the client aborts the
> > > upload, we will be left with an "upload uncompleted" onode.
> > > I think this problem is easy to solve: we can add a status code to
> > > the onode. A new onode will be set to "INIT", and after the upload
> > > completes, the onode will be set to "COMPLETED".
> >
> > Then, will the procedure be as follows?
> >
> > 1. lock container
> > 2. check whether the onode exists
> > 3. allocate data space for the object, create the onode, then write it down
> > 4. mark the onode as "INIT"
> > 5. unlock container
> > 6. upload object
> > 7. mark the onode as "COMPLETED"
> >
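The seven steps above can be sketched roughly as follows. This is only an
illustration in Python, not sheepdog's C code: `Container`, `create_object`,
and the in-memory dict are hypothetical names, and a local `threading.Lock`
stands in for the distributed container lock.

```python
import threading
from enum import Enum


class OnodeStatus(Enum):
    INIT = 0        # onode created, data upload still in progress
    COMPLETED = 1   # data upload finished


class Container:
    """In-memory stand-in for a container (hypothetical, not sheepdog's layout)."""

    def __init__(self):
        self.lock = threading.Lock()   # stands in for the distributed container lock
        self.onodes = {}               # object name -> {"status": ..., "data": ...}


def create_object(container, name, data):
    """Return True if this caller created the object, False if the name was taken."""
    with container.lock:                          # step 1: lock container
        if name in container.onodes:              # step 2: does the onode exist?
            return False
        onode = {
            "status": OnodeStatus.INIT,           # step 4: mark the onode as INIT
            "data": bytearray(len(data)),         # step 3: allocate data space
        }
        container.onodes[name] = onode            # step 3: create (publish) the onode
    # step 5: the lock is released when the with-block ends
    onode["data"][:] = data                       # step 6: upload outside the lock
    onode["status"] = OnodeStatus.COMPLETED       # step 7: mark as COMPLETED
    return True
```

Because the existence check and the onode creation happen under the same
lock, a second client uploading the same name sees the INIT onode and backs
off. An onode that stays in INIT marks an upload that was never finished.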
> > I'm not against this suggestion, but I'm wondering whether we can get
> > enough performance with this approach. IIUC, this introduces
> > additional write requests to the created onode at step 7.
> >
>
> Hi MORITA,
>
> We would only write the status of the onode (possibly a "uint8_t") back
> at step 7, so performance should not be hurt too much.
>
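To illustrate the single-byte status write, here is a minimal sketch under an
assumed onode layout. The layout (64-byte name, 8-byte size, 1-byte status)
and the names `ONODE_FORMAT`, `pack_onode`, and `mark_completed` are
hypothetical and do not reflect sheepdog's actual on-disk format.

```python
import io
import struct

# Hypothetical layout: 64-byte name, 8-byte size, 1-byte status, no padding.
ONODE_FORMAT = "<64sQB"
STATUS_OFFSET = struct.calcsize("<64sQ")   # status byte follows name and size
STATUS_INIT, STATUS_COMPLETED = 0, 1


def pack_onode(name, size, status=STATUS_INIT):
    """Serialize an onode; the name is padded with NUL bytes to 64 bytes."""
    return struct.pack(ONODE_FORMAT, name.encode(), size, status)


def mark_completed(dev, onode_offset):
    """Overwrite only the status byte, leaving the rest of the onode untouched."""
    dev.seek(onode_offset + STATUS_OFFSET)
    dev.write(bytes([STATUS_COMPLETED]))
```

This keeps the payload of the step-7 write down to one byte, but it is still
a separate write request per object.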
>
> > I've been evaluating Swift these days and noticed that Swift can
> > process thousands of PUT requests per second with only 3 nodes and 100
> > disks. Can Sheepdog achieve similar or better performance with the
> > suggestion?
> >
>
> At present, the bottleneck of swift on sheepdog is the distributed lock
> on each container. Therefore, if we send PUT requests to one hundred
> different containers, I think sheepdog could achieve similar performance
> with the suggestion.
Okay. I think this should be documented somewhere. I know some
benchmark tools for Swift which try to create many objects in one
container. Note that Swift shows high performance even if all the
requests are against one container.

I think it would be worth considering removing the container lock and
supporting an eventual consistency model, though that is future work.

Thanks,
Kazutaka