<div dir="ltr">Reviewed-by: Robin Dong <<a href="mailto:sanbai@taobao.com">sanbai@taobao.com</a>></div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-05-26 14:52 GMT+08:00 Liu Yuan <span dir="ltr"><<a href="mailto:namei.unix@gmail.com" target="_blank">namei.unix@gmail.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">v4:<br>
- fix slab create and destroy<br>
<div class="HOEnZb"><div class="h5"><br>
v3:<br>
- kill SPINLOCK macro hack by not using 'gnu99'<br>
- make this module compile with 2.6.32-series kernels (CentOS 6)<br>
<br>
v2:<br>
- improve memory allocation when the system is nearly out of memory<br>
- add slab allocator<br>
- use vmalloc for aiocb->buf to avoid memory allocation failure<br>
<br>
This is similar to Ceph's RBD. The main motivation is to replace complex<br>
and inefficient middleware (such as iSCSI software) with a simple software stack<br>
that exposes sheepdog storage through the Linux block device interface. This means<br>
we can optionally use the page cache as a client cache for buffered reads/writes,<br>
and the storage behaves like a normal Linux block device in your local file system.<br>
<br>
I implemented a (hopefully) high-performance aio framework for sending/receiving data.<br>
Compared with iSCSI tgt or sheepfs, this kernel module should provide much<br>
better performance because it has the shortest code path.<br>
<br>
With the single-major allocation scheme, we support at most 31 partitions per sheep block<br>
device, and up to 32768 devices can be attached to the local fs on a single node.<br>
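<br>
As a rough check of the numbers above (a sketch, not taken from the patch): assuming the<br>
kernel's 20-bit minor number space and 32 minors reserved per device (one whole-disk node<br>
plus 31 partitions), a single major number gives 32768 devices. The variable and function<br>
names below are made up for illustration.<br>

```shell
#!/bin/sh
# Assumption: Linux reserves 20 bits for the minor number (MINORBITS in
# include/linux/kdev_t.h), so one major number spans 2^20 minors.
MINORBITS=20
# Assumption: each sbd device takes 32 consecutive minors
# (1 whole-disk node + 31 partitions), inferred from the limits quoted above.
MINORS_PER_DEVICE=32

# Print how many devices fit under a single major number.
sbd_max_devices() {
    echo $(( (1 << MINORBITS) / MINORS_PER_DEVICE ))
}

# Print the first minor number of device index $1 (hypothetical layout).
sbd_first_minor() {
    echo $(( $1 * MINORS_PER_DEVICE ))
}

sbd_max_devices      # prints 32768
sbd_first_minor 1    # prints 32
```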
<br>
TODO<br>
- support cloned sheep vdi<br>
- auto-reconnect to sheep daemon if connection is off/crashed<br>
- better error handling<br>
- block device multi-queue support for recent kernels<br>
- live snapshot of sbd<br>
- support hyper volume<br>
<br>
You can access this patch set at origin/sbd.<br>
$ git pull;<br>
$ git checkout -b sbd origin/sbd<br>
<br>
To compile:<br>
$ cd sheepdog/sbd/; make<br>
$ insmod sbd.ko<br>
<br>
Usage:<br>
<br>
We control the device the same way as RBD.<br>
<br>
# associate vdi 'test' to /dev/sbd0<br>
$ echo 127.0.0.1 7000 test > /sys/bus/sbd/add<br>
<br>
# remove the device sbd0<br>
$ echo 0 > /sys/bus/sbd/remove<br>
<br>
# list the mapped devices<br>
$ cat /sys/bus/sbd/list<br>
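<br>
The commands above could be wrapped in small shell helpers, roughly as sketched below.<br>
Only the control file locations under /sys/bus/sbd and the "IP PORT VDI" line format come<br>
from the usage above; the function names and the SBD_CTL_DIR override are hypothetical and<br>
exist only so the helpers can be tried against a scratch directory.<br>

```shell
#!/bin/sh
# Directory holding the add/remove control files. The default is the path
# from the usage above; overriding it is an illustrative assumption.
SBD_CTL_DIR="${SBD_CTL_DIR:-/sys/bus/sbd}"

# Build the line written to the add file: "IP PORT VDI".
sbd_add_line() {
    [ "$#" -eq 3 ] || { echo "usage: sbd_add_line IP PORT VDI" >&2; return 1; }
    printf '%s %s %s\n' "$1" "$2" "$3"
}

# Map a vdi to the next free device. With the default directory this needs
# root and a loaded sbd.ko.
sbd_map() {
    line=$(sbd_add_line "$@") || return 1
    printf '%s\n' "$line" > "$SBD_CTL_DIR/add"
}

# Unmap the device with index $1 (0 for sbd0).
sbd_unmap() {
    printf '%s\n' "$1" > "$SBD_CTL_DIR/remove"
}
```

For example, "sbd_map 127.0.0.1 7000 test" followed by "sbd_unmap 0" mirrors the two echo commands above.<br>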
<br>
To get the best performance,<br>
<br>
# echo 4096 > /sys/block/sbd0/queue/max_sectors_kb<br>
<br>
This means the I/O scheduler will try its best to hand us 4MB requests.<br>
<br>
Liu Yuan (9):<br>
sheep: some macro preparation for sbd kernem modual<br>
sbd: introduce basic framework for Sheepdog Block Device<br>
sbd: implement write operation<br>
sbd: implement read operation<br>
sbd: add list interface to control file<br>
sbd: add support for single major allocation scheme<br>
sbd: some error handling refinements<br>
sbd: improve memory allocation when memory hit low<br>
sbd: use kmem_cache for sheep aiocb and request<br>
<br>
include/sheepdog_proto.h | 22 +-<br>
sbd/Kbuild | 5 +<br>
sbd/Makefile | 8 +<br>
sbd/sbd.h | 143 ++++++++++<br>
sbd/sheep.c | 680 +++++++++++++++++++++++++++++++++++++++++++++++<br>
sbd/sheep_block_device.c | 424 +++++++++++++++++++++++++++++<br>
6 files changed, 1275 insertions(+), 7 deletions(-)<br>
create mode 100644 sbd/Kbuild<br>
create mode 100644 sbd/Makefile<br>
create mode 100644 sbd/sbd.h<br>
create mode 100644 sbd/sheep.c<br>
create mode 100644 sbd/sheep_block_device.c<br>
<br>
--<br>
1.8.1.2<br>
<br>
--<br>
sheepdog mailing list<br>
<a href="mailto:sheepdog@lists.wpkg.org">sheepdog@lists.wpkg.org</a><br>
<a href="http://lists.wpkg.org/mailman/listinfo/sheepdog" target="_blank">http://lists.wpkg.org/mailman/listinfo/sheepdog</a><br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>--<br>Best Regards<br>Robin Dong
</div>