[sheepdog] [PATCH v1 0/5] replace structure of inode->data_vdi_id[] from array to b-tree

Liu Yuan namei.unix at gmail.com
Fri Oct 11 18:39:24 CEST 2013


On Fri, Oct 11, 2013 at 02:20:42PM +0800, Robin Dong wrote:
> Hi all,
> 
> The size of a vdi can only reach 4TB because inode->data_vdi_id[] can only
> support 1 million objects. But 4TB is too small for storage applications
> such as NAS and cloud disks, so we need to change the 'data_vdi_id' array to
> a b-tree.
> 
> This patchset adds a b-tree structure to sd_inode. It supports just two levels
> (one root node and many leaf nodes), and after this the size of a vdi could reach about
> (4MB / sizeof(sd_extent_header)) * (4MB / sizeof(sd_extent)) * 4MB = 1024PB, which
> is certainly enough for many storage requirements.

This is how big a vdi sheep can support from the inode's perspective. In reality, we
are also limited by the 32-bit (uint32_t) part of the oid space used for data
indexes. So I think we currently support at most 4G * 4MB = 16PB.
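
The arithmetic behind both limits can be checked with a small sketch. It
assumes 4MB data objects, 2^20 array entries ("1 million objects") and a
32-bit data index ("4G"), as stated in this thread. The names OBJECT_SIZE
and max_vdi_bytes() are made up for illustration and are not taken from the
sheepdog source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed data object size: 4MB, as stated in the thread. */
#define OBJECT_SIZE (UINT64_C(4) << 20)

/*
 * Hypothetical helper: largest vdi size in bytes when the data object
 * index has `index_bits` bits. Each index addresses one 4MB object.
 */
static uint64_t max_vdi_bytes(unsigned index_bits)
{
	/* 2^index_bits * 2^22 must stay below 2^64 to avoid overflow. */
	assert(index_bits <= 41);
	return (UINT64_C(1) << index_bits) * OBJECT_SIZE;
}

/* Convert bytes to whole TB (2^40) and PB (2^50). */
static uint64_t to_tb(uint64_t bytes) { return bytes >> 40; }
static uint64_t to_pb(uint64_t bytes) { return bytes >> 50; }
```

With these assumptions, to_tb(max_vdi_bytes(20)) gives 4 for the array-based
inode and to_pb(max_vdi_bytes(32)) gives 16 for the 32-bit index limit.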

I think you'd better briefly describe the backward-compatibility problem (I notice
you use store_policy to handle this issue).

It would also be nice if you could explain how the b-tree works in the source file.

I have also tried it with a 16PB vdi on sheepfs (after some small tweaks to get it to work):

1. dog vdi create test 16PB
2. mkdir dir
3. sheepfs dir
4. echo test > dir/vdi/mount
5. mkfs.xfs dir/vdi/test

And it reports an error.

Also, if we want the object cache to work with b-tree based vdis, we need to extend
the Object Cache ID to uint64_t so that it can cache b-tree inodes.

Thanks
Yuan


