On 11/12/2011 01:43 AM, Christoph Hellwig wrote:
> On Sat, Nov 12, 2011 at 12:04:47AM +0800, Liu Yuan wrote:
>> I am not sure what reflink is. Is it a synonym for snapshot? IIC, yes, the
>> new storage infrastructure named 'farm' will support cluster-wide snapshots.
>
> It basically is a file level snapshot, see here for an introduction:
>
> http://oss.oracle.com/osswiki/OCFS2/DesignDocs/ReflinkOperation
>

Thanks for the URL.

>> Simply put, it resembles git a lot (at both the code and the idea level).
>> There are three object types, named 'data, trunk, snapshot', that are
>> similar to git's 'blob, tree, commit'.
>>
>> 'data' is just sheepdog's data object, only named by the sha1 of its content.
>> So data objects with the same content will be mapped to a single
>> sha1 name, thus achieving node-wide data sharing.
>>
>> The 'snapshot' object serves to support snapshots; it contains the
>> snapshotted trunk, that is, the 'directory' of the data objects on
>> each node at that time. The trunk object provides a means to find
>> data objects. This will support cluster-wide snapshots.
>>
>> 'farm' doesn't put any constraint on object data size. Hope this helps.
>
> How well do snapshots / COW images work with large objects?
>
> Either way, I'm looking forward to seeing your work.
>

I cannot give a simple answer until I finish it.

As for your description of large data sizes, I have no doubt that they
will benefit rotational disks by utilizing the underlying extent-like
feature and neutralizing the seek overhead, but how about network
performance for big data transfers? It seems to me that we trade
fragmentation for wasted bandwidth. A single 128MB object will saturate
a 1000Mb NIC, so parallel data transfer seems very restricted.

That said, I have not given this issue much thought, and my networking
knowledge is not sufficient to make a pertinent remark.

Thanks,
Yuan
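
For readers following the thread, the git-like layout described in the quoted text (data objects named by the sha1 of their content, a trunk listing them, a snapshot pointing at a trunk) could be sketched roughly as below. This is only an illustration under assumptions, not farm's actual code: the `Store` class, `take_snapshot`, the in-memory dict, and the JSON encoding of trunk and snapshot objects are all hypothetical.

```python
import hashlib
import json


class Store:
    """Toy content-addressed store (hypothetical, not farm's on-disk format)."""

    def __init__(self):
        self.objects = {}  # sha1 hex digest -> raw bytes

    def put(self, payload: bytes) -> str:
        # The object's name is the sha1 of its content, so identical
        # content is stored only once.
        name = hashlib.sha1(payload).hexdigest()
        self.objects[name] = payload
        return name

    def get(self, name: str) -> bytes:
        return self.objects[name]


def take_snapshot(store: Store, live_objects: dict) -> str:
    """Store a snapshot of live_objects (object id -> current content).

    Returns the sha1 name of the snapshot object.
    """
    # 'trunk': a directory mapping each object id to its data object's sha1.
    trunk = {oid: store.put(data) for oid, data in sorted(live_objects.items())}
    trunk_name = store.put(json.dumps(trunk, sort_keys=True).encode())
    # 'snapshot': a small object that points at the trunk of that moment.
    snapshot = {"trunk": trunk_name}
    return store.put(json.dumps(snapshot, sort_keys=True).encode())


if __name__ == "__main__":
    store = Store()
    snap = take_snapshot(store, {"obj0": b"same", "obj1": b"same", "obj2": b"other"})
    trunk = json.loads(store.get(json.loads(store.get(snap))["trunk"]))
    # obj0 and obj1 share one data object because their content is identical.
    print(trunk["obj0"] == trunk["obj1"])
```

In this sketch, deduplication falls out of the naming scheme: two objects with the same content hash to the same name, so the second `put` simply overwrites an identical entry.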