[sheepdog] [PATCH v4 0/3] use hyper volume to store containers and objects

Fri Dec 13 06:03:21 CET 2013

On Fri, Dec 13, 2013 at 10:43:33AM +0800, Robin Dong wrote:
> 2013/12/12 Liu Yuan <namei.unix at gmail.com>
> 
> > On Thu, Dec 12, 2013 at 06:15:54PM +0800, Robin Dong wrote:
> > > From: Robin Dong <sanbai at taobao.com>
> > >
> > > The old implementation of kv can only support small objects (< 4MB), so we
> > > use a hyper volume (up to 16PB) to support a large number of big objects and
> > > add a lock to avoid race conditions.
> > >
> > > After creating an account, we will create a hyper volume vdi with the same
> > > name, and this vdi will store the inodes of buckets:
> > >
> > >   account vdi
> > >   +-----------+---+--------------------------+---+--------------------------+--
> > >   |name: coly |...|bucket_inode (name: jetta)|...|bucket_inode (name: volvo)|..
> > >   +-----------+---+--------------------------+---+--------------------------+--
> > >                                    |                             |
> > >                                   /                              |
> > >   bucket vdi                     /                               |
> > >   +-----------------+-------+ <--                                |
> > >   |name: coly/jetta |.......|                                    |
> > >   +-----------------+-------+                                   /
> > >                                bucket vdi                      /
> > >                                +-----------------+------+ <----
> > >                                | name: coly/volvo|......|
> > >                                +-----------------+------+
> > >
> > > An account can store up to "16PB / sizeof(struct bucket_inode)" buckets.
> > >
> > > Each bucket has two vdis: one called bucket-vdi (named after the bucket) and
> > > another called data-vdi (named "bucket/allocator"). The bucket-vdi stores the
> > > inodes of objects and the data-vdi stores the data of these objects:
> > >
> > >
> > >                         --------------------- kv_onode -----------------------
> > >                        |                                                      |
> > >   bucket vdi           v                                                      v
> > >   +-----------------+--+---------------------------+--------------------------+
> > >   |name: coly/fruit |..|kv_onode_hdr (name: banana)|onode_extent: start, count|
> > >   +-----------------+--+---------------------------+--------------------------+
> > >                                                                    /
> > >                                                                   /
> > >                                                       ------------
> > >                                                      /
> > >                    data vdi                        v
> > >                     +---------------------------+---+-----------------+
> > >                     |name: coly/fruit/allocator |...|       data      |
> > >                     +---------------------------+---+-----------------+
> > >
> > > The total size of data for objects in one bucket can currently reach 16PB; we
> > > will add multi-data-vdi support for buckets in the future.
> > >
> > > TODO:
> > >       1. add statistics for space in bytes for account/container.
> > >       2. one bucket could use many hyper volumes to store data of objects.
> > >       3. use kv_update_object() to upload large objects in parallel.
> >
> > I have tried the following steps:
> >
> >  1 create a user /yliu
> >  2 create a container /yliu/computer
> >  3 upload a 100M file A to /yliu/computer/test
> >  4 update the same file A again and it succeeded
> >
> > Then 'vdi list' showed:
> >
> > yliu@ubuntu-precise:~/sheepdog$ dog/dog vdi list
> >   Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
> >     yliu/computer/allocator     0   16 PB  204 MB  0.0 MB 2013-12-12 19:18    9ddb4     6
> >
> > Note the 204 MB: I think it should be 104 MB, because I overwrote A with
> > data of the same size.
> >
> >  5 delete the /yliu/computer/test
> >
> > yliu@ubuntu-precise:~/sheepdog$ dog/dog vdi list
> >   Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
> >     yliu/computer/allocator     0   16 PB  104 MB  0.0 MB 2013-12-12 19:18    9ddb4     6
> >
> > It seems that 100 MB will never get released.
> >
> > yliu@ubuntu-precise:~/sheepdog$ curl -X GET http://localhost/v1/yliu/computer
> >
> > It says that I have no files in it!
> >
> 
> As the swift API doc (
> http://docs.openstack.org/api/openstack-object-storage/1.0/content/create-update-object.html)
> says, the second create operation is an update operation, but we will not
> support update in this patchset. And if I return CONFLICT when users want to
> create an object twice, that will not be consistent with the swift API
> standard.
> 
> Any suggestions?

I think it is not an 'update'; a second 'PUT' means overwrite, which can be
described simply as:

 1. remove the old data as a whole
 2. create the object with the new data
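
A minimal sketch of these two steps, using a toy in-memory model rather than
the real kv code. BucketStore and all of its methods are hypothetical names
made up for illustration; the only point is that a second PUT frees the old
extent before allocating the new one, so the used space stays at the size of
one copy and drops to zero after a delete:

```python
# Toy in-memory model of a bucket. BucketStore and its methods are
# hypothetical and are NOT sheepdog's actual kv implementation.

class BucketStore:
    def __init__(self):
        self.objects = {}   # object name -> (start, length) extent
        self.data = {}      # start offset -> payload bytes
        self.used = 0       # bytes currently allocated
        self.next_free = 0  # naive bump allocator, never reuses offsets

    def _create(self, name, payload):
        # Allocate a fresh extent and record it under the object name.
        start = self.next_free
        self.next_free += len(payload)
        self.data[start] = payload
        self.objects[name] = (start, len(payload))
        self.used += len(payload)

    def delete(self, name):
        # Drop the object entry and release its extent.
        start, length = self.objects.pop(name)
        del self.data[start]
        self.used -= length

    def put(self, name, payload):
        # 1. remove the old data as a whole (only if the object exists)
        if name in self.objects:
            self.delete(name)
        # 2. create the object with the new data
        self._create(name, payload)

    def list(self):
        return sorted(self.objects)


store = BucketStore()
store.put("test", b"a" * 100)
store.put("test", b"b" * 100)    # second PUT overwrites the first
print(store.used)                # 100, not 200
store.delete("test")
print(store.used, store.list())  # 0 []
```

With this ordering the failure in step 1 leaves the old object gone, so a real
implementation would also need to decide how to handle errors between the two
steps; the sketch ignores that.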

Thanks
Yuan