[sheepdog] [PATCH 1/4] sheep: fix typo in help information

Robin Dong robin.k.dong at gmail.com
Tue Nov 26 09:38:01 CET 2013


2013/11/26 MORITA Kazutaka <morita.kazutaka at gmail.com>

> At Mon, 25 Nov 2013 17:02:06 +0800,
> Liu Yuan wrote:
> >
> > On Mon, Nov 25, 2013 at 05:43:19PM +0900, MORITA Kazutaka wrote:
> > > At Mon, 25 Nov 2013 15:03:46 +0800,
> > > Robin Dong wrote:
> > > >
> > > > The present implementation of http/swift is not perfect: it can't
> > > > create too many containers or objects. So we want to store all
> > > > objects in one hyper volume vdi and use a new structure, 'obj-inode',
> > > > to identify each object's offset and length in this vdi, just like a
> > > > local file system. To achieve this, we need distributed locks to
> > > > ensure that only one thread can create (or delete) an 'obj-inode'
> > > > in this vdi at a time.
> > > >
> > > > This patch set is an attempt to implement the distributed lock.
> > > >
> > > > If we add code in sheep/cluster/zookeeper.c and use the cluster
> > > > framework to implement this distributed lock, then we have to add
> > > > implementations for corosync, local and shepherd. That's too
> > > > complicated. So what we need is to add lock.c in sheep/http/ and
> > > > use it only in the http interface.
> > >
> > > If possible, I would rather not see zookeeper-specific code outside of
> > > sheep/cluster/zookeeper.c.  Can we use a SD_OP_TYPE_CLUSTER operation
> > > for your purpose?  It works like a cluster-wide distributed lock.
> > >
> > > For example, vdi creation works as follows.
> > >
> > >  1. When sheep receives a SD_OP_NEW_VDI operation, sheep calls
> > >     cdrv->block() to block all the other cluster operations.
> > >
> > >  2. Sheep calls cluster_new_vdi() in sd_block_handler().  It is
> > >     ensured that no other sheep call sd_block_handler() at the same
> > >     time.  This is necessary here because sheepdog doesn't allow
> > >     concurrent vdi creation requests.
> > >
> > >  3. All the sheep in the cluster call post_cluster_new_vdi() in
> > >     sd_notify_handler().  It is usually used for notification or
> > >     cleanups.
> > >
> >
> > I don't think this approach is efficient, though it is simpler because
> > we can make use of the existing mechanism, since:
> >
> > - it can't scale, meaning there is only one lock in the cluster,
> >   and object creations from different containers will all compete
> >   for this lock.
> >
> > - it can be affected by operations not even related to http. For
> >   example, 'vdi create' will block the cluster, which means that until
> >   it unblocks the cluster, we can't create/delete objects or
> >   containers at all.
> >
> > I think a lock per operation is really needed. E.g., every container
> > has a lock, to allow concurrent object creation without interfering
> > with other containers.
>
> Getting a distributed lock is an expensive operation and it can cause
> a severe performance problem if we do it for each object creation.
> Can we find another way?  Sheepdog is not designed to allow concurrent
> write access.
>

It will hurt performance if the objects are very small, but for big objects
(1GB, 10GB, 100GB) we only need the lock at the "create object inode" moment;
after that, the object upload itself does not need the lock.
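
To illustrate the point above, here is a minimal sketch in Python (not
sheepdog code). The HyperVolume class, its method names and put_object() are
hypothetical, and a threading.Lock stands in for the distributed lock. It
only shows that the extent reservation is the short, locked step while the
upload into the reserved extent runs without the lock.

```python
import threading


class HyperVolume:
    """Toy in-memory stand-in for one large vdi holding many objects."""

    def __init__(self, capacity):
        self.data = bytearray(capacity)
        self.inodes = {}            # object name -> (offset, length)
        self._next_free = 0
        # Stand-in for the distributed (e.g. zookeeper-backed) lock.
        self._lock = threading.Lock()

    def create_obj_inode(self, name, length):
        """Reserve an extent; the only step that runs under the lock."""
        with self._lock:
            if name in self.inodes:
                raise FileExistsError(name)
            if self._next_free + length > len(self.data):
                raise OSError("hyper volume is full")
            offset = self._next_free
            self._next_free += length
            self.inodes[name] = (offset, length)
            return offset

    def upload(self, name, payload):
        """Write into the reserved extent; no lock is taken here."""
        offset, length = self.inodes[name]
        if len(payload) != length:
            raise ValueError("payload does not match reserved length")
        self.data[offset:offset + length] = payload


def put_object(volume, name, payload):
    volume.create_obj_inode(name, len(payload))   # short, locked
    volume.upload(name, payload)                  # long, unlocked
```

Because every object gets a disjoint extent while the lock is held, uploads
from different threads cannot overwrite each other afterwards.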

I have tested this zookeeper lock: it can lock/unlock 200 times per
second, which I think is not too slow even for small objects.
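
As a rough way to reason about that number, the sketch below estimates what
share of one object's creation time goes to the lock. Only the 200
lock/unlock pairs per second comes from my test; the function name and the
100 MB/s upload bandwidth are assumptions for illustration, not measurements.

```python
def lock_share(object_bytes,
               locks_per_second=200.0,          # measured lock/unlock rate
               upload_bytes_per_second=100e6):  # assumed bandwidth, not measured
    """Fraction of (lock + upload) time spent on one lock/unlock pair."""
    lock_seconds = 1.0 / locks_per_second
    upload_seconds = object_bytes / upload_bytes_per_second
    return lock_seconds / (lock_seconds + upload_seconds)
```

Under these assumptions the share is negligible for a 1GB object and
dominant for a 4KB one, so the small-object case is where the lock rate
matters.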


> For example, how about determining one gateway based on the hash value
> of the requested container name, and forwarding write requests to the
> appropriate gateway so that all the objects in the same container are
> accessed from only one gateway?
>
> Thanks,
>
> Kazutaka
>
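
For reference, the hash-based routing suggested above could look roughly like
this Python sketch. pick_gateway() and the gateway names are hypothetical,
and sheepdog has no such function as far as this thread shows. It only
demonstrates that every node maps a given container name to the same gateway
without taking any lock.

```python
import hashlib


def pick_gateway(container, gateways):
    """Map a container name to one gateway, identically on every node."""
    # Sort so that all nodes agree regardless of how they list the members.
    members = sorted(gateways)
    digest = hashlib.sha1(container.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(members)
    return members[index]
```

Note that plain modulo hashing remaps many containers whenever the gateway
list changes; a consistent-hash ring would reduce that, at the cost of more
code.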



-- 
Best Regards
Robin Dong