[sheepdog] [PATCH v5 0/4] use hyper volume to store containers and objects
Liu Yuan
namei.unix at gmail.com
Fri Dec 13 13:08:58 CET 2013
On Fri, Dec 13, 2013 at 06:55:13PM +0800, Robin Dong wrote:
> From: Robin Dong <sanbai at taobao.com>
>
> The old implementation of kv can only support small objects (< 4MB), so we
> use a hyper volume (up to 16PB) to support a large number of big objects and
> add a lock to avoid race conditions.
>
> After creating an account, we create a hyper volume vdi with the same name,
> and this vdi stores the inodes of buckets:
>
> account vdi
> +-----------+---+--------------------------+---+--------------------------+--
> |name: coly |...|bucket_inode (name: jetta)|...|bucket_inode (name: volvo)|..
> +-----------+---+--------------------------+---+--------------------------+--
>                                  |                              |
>                                 /                               |
> bucket vdi                     /                                |
> +-----------------+-------+ <--                                 |
> |name: coly/jetta |.......|                                     |
> +-----------------+-------+                                    /
>                               bucket vdi                      /
>                               +-----------------+------+ <----
>                               | name: coly/volvo|......|
>                               +-----------------+------+
>
> An account can store up to "16PB / sizeof(struct bucket_inode)" buckets.
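
To make the capacity formula above concrete, here is a minimal sketch of the
arithmetic. It is not code from the patch: struct example_bucket_inode, its
fields, its 512-byte size, and max_slots() are all hypothetical, and "16PB" is
assumed to mean 16 * 2^50 bytes. The real limit depends on the actual
sizeof(struct bucket_inode).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: "16PB" is read as 16 * 2^50 bytes. */
#define HYPER_VOLUME_BYTES (UINT64_C(16) << 50)

/*
 * Hypothetical layout, padded to 512 bytes for illustration only.
 * The real struct bucket_inode is defined by the patch set.
 */
struct example_bucket_inode {
	char name[64];
	uint64_t obj_count;
	uint64_t bytes_used;
	uint8_t pad[512 - 64 - 2 * sizeof(uint64_t)];
};

/* Number of fixed-size inode slots that fit in one volume. */
static uint64_t max_slots(uint64_t volume_bytes, size_t inode_size)
{
	assert(inode_size > 0);
	return volume_bytes / inode_size;
}
```

With the assumed 512-byte inode, max_slots(HYPER_VOLUME_BYTES, 512) is 2^45
slots, so the bucket count per account is unlikely to be the limiting factor.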
>
> Each bucket has two vdis: one called bucket-vdi (named after the bucket) and
> another called data-vdi (named "bucket/allocator"). The bucket-vdi stores the
> inodes of objects and the data-vdi stores the data of these objects:
>
>
>                       --------------------- kv_onode -----------------------
>                      |                                                      |
> bucket vdi           v                                                      v
> +-----------------+--+---------------------------+--------------------------+
> |name: coly/fruit |..|kv_onode_hdr (name: banana)|onode_extent: start, count|
> +-----------------+--+---------------------------+--------------------------+
>                                                      /
>                                                     /
>                                         ------------
>                                        /
> data_vid                              v
> +---------------------------+---+-----------------+
> |name: coly/fruit/allocator |...|      data       |
> +---------------------------+---+-----------------+
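
The onode layout in the diagram (a header followed by an extent that points
into the data vdi) can be sketched roughly as below. This is an illustration
under assumptions, not the patch's definitions: every example_* name, the field
set, the name length, and the choice of 4MB as the unit for "start" and "count"
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_NAME_LEN   256                  /* hypothetical name limit */
#define EXAMPLE_BLOCK_SIZE (UINT64_C(4) << 20)  /* assumed 4MB extent unit */

/* Hypothetical header: identifies the object inside the bucket vdi. */
struct example_onode_hdr {
	char name[EXAMPLE_NAME_LEN];
	uint64_t size;                          /* object size in bytes */
};

/* Extent: a contiguous run of blocks inside the data vdi. */
struct example_onode_extent {
	uint64_t start;                         /* first block index */
	uint64_t count;                         /* number of blocks */
};

struct example_onode {
	struct example_onode_hdr hdr;
	struct example_onode_extent ext;
};

/* Blocks needed to hold an object of the given size (rounds up). */
static uint64_t blocks_needed(uint64_t size)
{
	return (size + EXAMPLE_BLOCK_SIZE - 1) / EXAMPLE_BLOCK_SIZE;
}

/* Byte offset of the extent within the data vdi. */
static uint64_t extent_offset(const struct example_onode_extent *e)
{
	return e->start * EXAMPLE_BLOCK_SIZE;
}

/* Byte length covered by the extent. */
static uint64_t extent_length(const struct example_onode_extent *e)
{
	return e->count * EXAMPLE_BLOCK_SIZE;
}
```

Keeping only (start, count) in the inode means the bucket vdi stays small and
fixed-size per object, while the data vdi absorbs the variable-size payload.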
>
> The total size of data for objects in one bucket can currently reach 16PB; we
> will add multi-data-vdi support for buckets in the future.
>
> TODO:
> 1. add statistics for space in bytes for account/container.
> 2. one bucket could use many hyper volumes to store data of objects.
> 3. use kv_update_object() to upload large objects in parallel.
>
> v4 --> v5:
> 1. add kv_find_object() to look up an object before creating or deleting it.
> 2. discard the oid after deleting an onode.
Applied this patch set and let's roll on it, thanks.
Yuan