[sheepdog] [PATCH v5 0/4] use hyper volume to store containers and objects

Robin Dong robin.k.dong at gmail.com
Fri Dec 13 11:55:13 CET 2013


From: Robin Dong <sanbai at taobao.com>

The old implementation of kv can only support small objects (< 4MB), so we
use a hyper volume (up to 16PB) to support a large number of big objects and
add locks to avoid race conditions.

After creating an account, we create a hyper volume vdi with the same name,
and this vdi stores the inodes of its buckets:

  account vdi
  +-----------+---+--------------------------+---+--------------------------+--
  |name: coly |...|bucket_inode (name: jetta)|...|bucket_inode (name: volvo)|..
  +-----------+---+--------------------------+---+--------------------------+--
                                   |                             |
                                  /                              |
  bucket vdi                     /                               |
  +-----------------+-------+ <--                                |
  |name: coly/jetta |.......|                                    |
  +-----------------+-------+                                   /
                               bucket vdi                      /
                               +-----------------+------+ <----
                               | name: coly/volvo|......|
                               +-----------------+------+

An account can store up to "16PB / sizeof(struct bucket_inode)" buckets.
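
As a rough illustration of that bound, the sketch below divides the 16PB
hyper volume capacity by a per-bucket inode size. The real
sizeof(struct bucket_inode) is not stated in this letter, so the helper name
max_buckets() and the 4096-byte size used in the note below are assumptions
for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* 16PB, the hyper volume limit quoted above (16 * 2^50 bytes). */
#define HYPER_VOLUME_SIZE (UINT64_C(16) << 50)

/*
 * Hypothetical helper (not part of the patch): how many fixed-size
 * bucket inodes fit into one account vdi.
 */
static uint64_t max_buckets(uint64_t bucket_inode_size)
{
	assert(bucket_inode_size > 0); /* guard against division by zero */
	return HYPER_VOLUME_SIZE / bucket_inode_size;
}
```

With an assumed inode size of 4096 bytes, max_buckets(4096) evaluates to 2^42
buckets per account; the real figure depends on the actual struct layout.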

Each bucket has two vdis: one called bucket-vdi (named after the bucket) and
another called data-vdi (named "bucket/allocator"). The bucket-vdi stores the
inodes of objects and the data-vdi stores the data of these objects:

 
                        --------------------- kv_onode -----------------------
                       |                                                      |
  bucket vdi           v                                                      v
  +-----------------+--+---------------------------+--------------------------+
  |name: coly/fruit |..|kv_onode_hdr (name: banana)|onode_extent: start, count|
  +-----------------+--+---------------------------+--------------------------+
                                                                   /
                                                                  /
                                                      ------------
                                                     /
                     data vdi                        v
                    +---------------------------+---+-----------------+
                    |name: coly/fruit/allocator |...|       data      |
                    +---------------------------+---+-----------------+

The total size of data for objects in one bucket is currently limited to 16PB;
we will add multi-data-vdi support for buckets in the future.
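
To make the "onode_extent: start, count" field in the diagram more concrete,
here is a minimal sketch of how an extent could be sized. It assumes that
start and count are expressed in units of SD_DATA_OBJ_SIZE and that
SD_DATA_OBJ_SIZE is 4MB (matching the small-object limit mentioned above).
The struct layout and the helpers make_extent() and extent_byte_offset() are
illustrative assumptions, and the two macros are defined locally so the
sketch stands alone.

```c
#include <stdint.h>

/* Assumed value: 4MB, matching the small-object limit mentioned above. */
#define SD_DATA_OBJ_SIZE (UINT64_C(1) << 22)
/* Local definition for this sketch: integer division rounding up. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Illustrative layout only; the struct in the patch may differ. */
struct onode_extent {
	uint64_t start; /* index of the first data object in the data vdi */
	uint64_t count; /* number of consecutive data objects */
};

/* Hypothetical helper: describe an object of object_size bytes at start. */
static struct onode_extent make_extent(uint64_t start, uint64_t object_size)
{
	struct onode_extent ext = {
		.start = start,
		.count = DIV_ROUND_UP(object_size, SD_DATA_OBJ_SIZE),
	};

	return ext;
}

/* Hypothetical helper: byte offset of the extent inside the data vdi. */
static uint64_t extent_byte_offset(const struct onode_extent *ext)
{
	return ext->start * SD_DATA_OBJ_SIZE;
}
```

Under these assumptions a 10MB object needs an extent with count = 3, since
the last data object is only partially filled.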

TODO:
	1. add statistics of space usage in bytes for accounts/containers.
	2. let one bucket use many hyper volumes to store the data of objects.
	3. use kv_update_object() to upload large objects in parallel.

v4 --> v5:
	1. add kv_find_object() to look up an object before creating or
	   deleting it.
	2. discard the oid after deleting an onode.

v3 --> v4:
	1. modify the error handling of kv_get_bucket() to make it easier to
	   understand.
	2. rename "kv_get_bucket()" to "kv_get_lock_bucket()" to indicate that
	   the function acquires the lock.
	3. remove unused 'data_buf'.
	4. use DIV_ROUND_UP() instead of a hard-coded calculation.
	5. check Content-Length against the uploaded size of an object to
	   avoid stalling during upload.

v2 --> v3:
	1. add lock/unlock to avoid race conditions when many nodes
	   create or delete containers in parallel.

v1 --> v2:
	1. change the field name to "onode_vid" in struct bucket_inode.
	2. remove the "SD_OP_DELETE_CACHE" operation.
	3. remove the extra blank line when listing containers and objects.
	4. add "Content-Length" when listing containers and objects.
	5. remove some warnings in kv_delete_bucket().

Robin Dong (3):
  sheep/http: store accounts and containers into hyper volume for
    object-storage
  sheep/http: add support for big object which is larger than
    SD_DATA_OBJ_SIZE
  sheep/http: add lock to protect container and object

 sheep/http/http.c  |    5 +
 sheep/http/http.h  |    1 +
 sheep/http/kv.c    | 1164 +++++++++++++++++++++++++++++++++++++++++++---------
 sheep/http/kv.h    |   37 +-
 sheep/http/s3.c    |   14 +-
 sheep/http/swift.c |  151 ++++---
 6 files changed, 1099 insertions(+), 273 deletions(-)

-- 
1.7.12.4



