[sheepdog] [PATCH v4 0/3] use hyper volume to store containers and objects
Robin Dong
robin.k.dong at gmail.com
Thu Dec 12 11:15:54 CET 2013
From: Robin Dong <sanbai at taobao.com>
The old implementation of kv can only support small objects (< 4MB), so we
use a hyper volume (up to 16PB) to support a large number of big objects and
add a lock to avoid race conditions.
After creating an account, we create a hyper volume vdi with the same name,
and this vdi stores the inodes of the buckets:
account vdi
+-----------+---+--------------------------+---+--------------------------+--
|name: coly |...|bucket_inode (name: jetta)|...|bucket_inode (name: volvo)|..
+-----------+---+--------------------------+---+--------------------------+--
                                 |                             |
                                /                              |
bucket vdi                     /                               |
+-----------------+-------+ <--                                |
|name: coly/jetta |.......|                                    |
+-----------------+-------+                                   /
                             bucket vdi                      /
                             +-----------------+------+ <----
                             | name: coly/volvo|......|
                             +-----------------+------+
An account can store up to "16PB / sizeof(struct bucket_inode)" buckets.
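The bucket limit above is a plain division, sketched below in C. The struct here is a hypothetical stand-in (the real struct bucket_inode lives in sheep/http/kv.c and is not reproduced), max_buckets() is an illustrative helper name, and 16PB is assumed to mean 2^54 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the 16PB hyper volume limit is 16 * 2^50 = 2^54 bytes. */
#define EXAMPLE_HYPER_VOLUME_SIZE (UINT64_C(16) << 50)

/*
 * Hypothetical stand-in for struct bucket_inode. The field names and
 * sizes are made up for illustration only.
 */
struct example_bucket_inode {
	char name[64];
	uint64_t obj_count;
	uint64_t bytes_used;
	uint32_t onode_vid;
	uint32_t pad;
};

/* Upper bound on fixed-size inode slots that fit in one hyper volume. */
static uint64_t max_buckets(uint64_t inode_size)
{
	assert(inode_size > 0);
	return EXAMPLE_HYPER_VOLUME_SIZE / inode_size;
}
```

For example, max_buckets(sizeof(struct example_bucket_inode)) gives the bound for the made-up layout above; the real bound depends on the actual size of struct bucket_inode.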
Each bucket has two vdis: one called bucket-vdi (named after the bucket) and
another called data-vdi (named "bucket/allocator"). The bucket-vdi stores the
inodes of objects and the data-vdi stores the data of these objects:
                      --------------------- kv_onode -----------------------
                     |                                                      |
bucket vdi           v                                                      v
+-----------------+--+---------------------------+--------------------------+
|name: coly/fruit |..|kv_onode_hdr (name: banana)|onode_extent: start, count|
+-----------------+--+---------------------------+--------------------------+
                                                        /
                                                       /
                                           ------------
                                          /
data_vid                                 v
+---------------------------+---+-----------------+
|name: coly/fruit/allocator |...|      data       |
+---------------------------+---+-----------------+
The total size of data for objects in one bucket can currently reach 16PB; we
will add multi-data-vdi support for buckets in the future.
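To illustrate how an onode_extent (start, count) could point into the data-vdi, here is a minimal sketch. It assumes that start and count are measured in data objects of SD_DATA_OBJ_SIZE bytes and that SD_DATA_OBJ_SIZE is 4MB; the struct and helper names are hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of SD_DATA_OBJ_SIZE (4MB, the small-object limit above). */
#define EXAMPLE_DATA_OBJ_SIZE (UINT64_C(4) << 20)

/* Hypothetical extent: `count` consecutive data objects from index `start`. */
struct example_extent {
	uint64_t start;
	uint64_t count;
};

/* Byte offset inside the data-vdi where the extent begins. */
static uint64_t extent_offset(const struct example_extent *ext)
{
	assert(ext != NULL);
	return ext->start * EXAMPLE_DATA_OBJ_SIZE;
}

/* Number of bytes the extent can hold. */
static uint64_t extent_capacity(const struct example_extent *ext)
{
	assert(ext != NULL);
	return ext->count * EXAMPLE_DATA_OBJ_SIZE;
}
```

Under these assumptions, an extent with start = 3 and count = 2 begins 12MB into the data-vdi and can hold 8MB of object data.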
TODO:
1. add statistics for space in bytes for account/container.
2. one bucket could use many hyper volumes to store data of objects.
3. use kv_update_object() to upload large objects in parallel.
v3 --> v4:
1. modify the error handling of kv_get_bucket() to make it easier to understand.
2. rename "kv_get_bucket()" to "kv_get_lock_bucket()" to make clear that
it acquires the lock inside the function.
3. remove unused 'data_buf'.
4. use DIV_ROUND_UP() instead of hard code.
5. check Content-Length against the uploaded size of the object to avoid
hanging during upload.
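Items 4 and 5 can be sketched as follows. DIV_ROUND_UP() is written here in its usual ceiling-division form, which is an assumption about the macro the patches use; the 4MB object size and the helper names objects_needed() and upload_complete() are also assumptions made for illustration, not code from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of SD_DATA_OBJ_SIZE (4MB). */
#define EXAMPLE_DATA_OBJ_SIZE (UINT64_C(4) << 20)

/* Usual ceiling-division form; redefined locally to stay self-contained. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Data objects needed to hold content_length bytes. */
static uint64_t objects_needed(uint64_t content_length)
{
	/* Guard against overflow in the (n + d - 1) step of the macro. */
	assert(content_length <= UINT64_MAX - (EXAMPLE_DATA_OBJ_SIZE - 1));
	return DIV_ROUND_UP(content_length, EXAMPLE_DATA_OBJ_SIZE);
}

/* True only when the received byte count matches Content-Length. */
static bool upload_complete(uint64_t content_length, uint64_t received)
{
	return received == content_length;
}
```

A handler built along these lines could reject a request as soon as the received size differs from Content-Length instead of waiting for more data.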
v2 --> v3:
1. add lock/unlock to avoid race conditions when many nodes
create or delete containers in parallel.
v1 --> v2:
1. use the name "onode_vid" in struct bucket_inode.
2. remove "SD_OP_DELETE_CACHE" operation.
3. remove the extra blank line when listing containers and objects.
4. add "Content-Length" when listing containers and objects.
5. remove some warnings in kv_delete_bucket().
Robin Dong (3):
sheep/http: store accounts and containers into hyper volume for
object-storage
sheep/http: add support for big object which is larger than
SD_DATA_OBJ_SIZE
sheep/http: add lock to protect container and object
sheep/http/http.c | 5 +
sheep/http/http.h | 1 +
sheep/http/kv.c | 1164 +++++++++++++++++++++++++++++++++++++++++++---------
sheep/http/kv.h | 37 +-
sheep/http/s3.c | 14 +-
sheep/http/swift.c | 151 ++++---
6 files changed, 1099 insertions(+), 273 deletions(-)
--
1.7.12.4