[sheepdog] [PATCH v5 0/4] use hyper volume to store containers and objects

Liu Yuan namei.unix at gmail.com
Fri Dec 13 13:08:58 CET 2013


On Fri, Dec 13, 2013 at 06:55:13PM +0800, Robin Dong wrote:
> From: Robin Dong <sanbai at taobao.com>
> 
> The old implementation of kv can only support small objects (< 4MB), so we
> use hyper volumes (up to 16PB) to support a large number of big objects and
> add locking to avoid race conditions.
> 
> After creating an account, we create a hyper volume vdi with the same name,
> and this vdi stores the inodes of the buckets:
> 
>   account vdi
>   +-----------+---+--------------------------+---+--------------------------+--
>   |name: coly |...|bucket_inode (name: jetta)|...|bucket_inode (name: volvo)|..
>   +-----------+---+--------------------------+---+--------------------------+--
>                                    |                             |
>                                   /                              |
>   bucket vdi                     /                               |
>   +-----------------+-------+ <--                                |
>   |name: coly/jetta |.......|                                    |
>   +-----------------+-------+                                   /
>                                bucket vdi                      /
>                                +-----------------+------+ <----
>                                | name: coly/volvo|......|
>                                +-----------------+------+
> 
> An account can store up to "16PB / sizeof(struct bucket_inode)" buckets.
> 
> Each bucket has two vdis: one called bucket-vdi (named after the bucket) and
> another called data-vdi (named "bucket/allocator"). The bucket-vdi stores the
> inodes of objects and the data-vdi stores the data of these objects:
> 
>  
>                         --------------------- kv_onode -----------------------
>                        |                                                      |
>   bucket vdi           v                                                      v
>   +-----------------+--+---------------------------+--------------------------+
>   |name: coly/fruit |..|kv_onode_hdr (name: banana)|onode_extent: start, count|
>   +-----------------+--+---------------------------+--------------------------+
>                                                                    /
>                                                                   /
>                                                       ------------
>                                                      /
>                     data_vid                        v
>                     +---------------------------+---+-----------------+
>                     |name: coly/fruit/allocator |...|       data      |
>                     +---------------------------+---+-----------------+
> 
> The total size of data for objects in one bucket can currently reach 16PB; we
> will add multi-data-vdi support for buckets in the future.
> 
> TODO:
> 	1. add statistics for space in bytes for account/container.
> 	2. one bucket could use many hyper volumes to store data of objects.
> 	3. use kv_update_object() to upload large objects in parallel.
> 
> v4 --> v5:
> 	1. add kv_find_object() to look up an object before creating or deleting it.
> 	2. discard the oid after deleting an onode.

Applied this patch set and let's roll on it, thanks.

Yuan


