[Sheepdog] [PATCH v4 10/12] farm: add a documentation for farm internals

Liu Yuan namei.unix at gmail.com
Sun Dec 25 16:42:56 CET 2011


From: Liu Yuan <tailai.ly at taobao.com>


Signed-off-by: Liu Yuan <tailai.ly at taobao.com>
---
 doc/farm-internal.txt |  121 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 121 insertions(+), 0 deletions(-)
 create mode 100644 doc/farm-internal.txt

diff --git a/doc/farm-internal.txt b/doc/farm-internal.txt
new file mode 100644
index 0000000..ad470f3
--- /dev/null
+++ b/doc/farm-internal.txt
@@ -0,0 +1,121 @@
+                        ==================
+                            Farm Store
+                        ==================
+
+              Liu Yuan <namei.unix at gmail.com> Taobao Inc.
+
+1.  OVERVIEW
+
+Farm is a per-node object store for Sheepdog. It consists of a backend
+store, which caches the snapshot objects, and a working directory, which holds
+the objects that Sheepdog currently operates on. Consequently, the I/O
+performance for VM guests should be practically the same as with Simple Store.
+
+Snapshots are triggered either by the system recovery code or by users, and
+Farm is expected to restore every object to its state at the time the user
+snapshot was taken. In this context, a snapshot object means both meta
+objects and data objects.
+
+2.  DESIGN
+
+Simply put, Farm closely resembles git (at both the code and the idea level).
+There are three object types, named 'data', 'trunk' and 'snapshot'[*], which
+are similar to git's 'blob', 'tree' and 'commit'.
+
+[*] shortened to 'snap' below.
+
+A 'data' object is just a Sheepdog I/O object, named only by the sha1 of its
+content. Data objects with the same content are therefore mapped to a
+single sha1 file, which achieves node-wide data sharing.
+
+A 'trunk' object ties data objects together into a flat directory structure at
+the time the snapshot is taken. The trunk object provides a means to
+find old data objects in the store.
+
+A 'snap' object describes a snapshot, either initiated by users or triggered
+by the recovery code. The snap object refers to one of the trunk objects. The
+two snap log files provide a means to name the desired snap object.
+
+All these objects are described in the context of taking snapshots or
+retrieving old data from snapshotted objects; that is, the objects are
+'cached' in the Farm store by snapshot operations.
+
+2.  OBJECT LAYOUT
+
+All the objects (snap, trunk, data) in Farm are built on the operations of
+sha1_file. sha1_file gives us compression and consistency checking
+independent of the content or the type of the object.
+
+An object that inflates successfully yields a stream of bytes of the form
+
+    <sha1_file_hdr>  + <binary object data>
+          |                     |
+        header               payload
+
+The payload of the data object is the compressed content of Sheepdog's I/O object.
+
+For a trunk object, the compressed content is
+
+    <array of the struct trunk_entry>
+
+    struct trunk_entry {
+            uint64_t oid;
+            unsigned char sha1[SHA1_LEN];
+    };
+
+For a snap object, the compressed content is
+
+    <trunk_sha1> + <array of the struct sd_node>
+
+For snap operations, besides the snap objects, Farm keeps two log files whose
+records have the following structure
+
+    struct snap_log {
+            int epoch;
+            uint64_t time;
+            unsigned char sha1[SHA1_LEN];
+    };
+
+This provides an internal naming mechanism and helps us find snap objects by epoch.
+
+3.  STALE OBJECT
+
+When an object is stored into the backend store as a snapshot is taken, either
+
+    a) content unchanged, so it points to the same old sha1_file (no stale object)
+    or
+    b) content updated, so it points to a new object with a new sha1.
+
+We need to remove the stale object in case b), but only on the assumption that
+it was generated by the recovery code. [*]
+
+When we store a new snapshot object into the backend store, it is a safe and
+convenient time to remove the old object with the same object ID.
+
+User snapshot objects need not be removed until the snapshot is deleted.
+
+[*] Here I assume we don't need to restore to 'sys epoch' state.
+
+4.  VIRTUAL FIGURE
+
+
+                  sys_snap, user_snap      snapshot requests
+                          |                          |
+                          |put/get snap_sha1         | trigger
+                          v                          |
+   +----------+        +------+        +--------+    v       +----------+
+   |          |<------>| snap |<++++++>|        | <========> |          |
+   |          |        +------+        |        |            | Farm     |
+   |          |                        | trunk  |            | Working  |   I/O   +-------+
+   |          |<---------------------->|        |            | Directory| <~~~~~~>|sheep  |
+   | Farm     |                        +--------+            |          |         +-------+
+   | Backend  |                                              |          |
+   | Store    |                                              |          |
+   |          |<-------------------------------------------->|          |
+   |          |                                              |          |
+   +----------+                                              +----------+
+
+<-----> put/get objects to/from Farm Store
+<+++++> put/get trunk_sha1 to/from snap object
+<=====> put/get oid/oid_sha1 pairs to/from trunk object
+
-- 
1.7.8.rc3
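
The stale-object rule from the STALE OBJECT section of the document can be
sketched in C roughly as follows. This is only an illustrative sketch, not code
from the patch: struct trunk_entry and SHA1_LEN follow the document (SHA1_LEN
is assumed to be 20 here), while trunk_find(), old_object_is_stale() and the
old_is_user_snap flag are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHA1_LEN 20	/* assumption: size of a raw SHA-1 digest */

/* One entry of a trunk object, as laid out in the Farm document. */
struct trunk_entry {
	uint64_t oid;
	unsigned char sha1[SHA1_LEN];
};

/*
 * Hypothetical helper: linear search of a trunk payload for an object ID.
 * Returns the matching entry, or NULL if the oid is not in the trunk.
 */
static inline const struct trunk_entry *
trunk_find(const struct trunk_entry *entries, size_t nr, uint64_t oid)
{
	assert(entries != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++)
		if (entries[i].oid == oid)
			return &entries[i];
	return NULL;
}

/*
 * Hypothetical helper: decide whether the previously stored sha1 file for
 * an oid becomes stale when the object is stored again.
 *
 *   case a) same sha1      -> same sha1 file is reused, nothing is stale
 *   case b) different sha1 -> old file is stale, but only if it was
 *                             generated by the recovery code; user snapshot
 *                             objects are kept until the snapshot is deleted
 *
 * Returns 1 if the old object may be removed, 0 otherwise.
 */
static inline int
old_object_is_stale(const struct trunk_entry *old,
		    const unsigned char new_sha1[SHA1_LEN],
		    int old_is_user_snap)
{
	if (old == NULL)
		return 0;	/* nothing was stored for this oid before */
	if (memcmp(old->sha1, new_sha1, SHA1_LEN) == 0)
		return 0;	/* case a): content unchanged */
	return !old_is_user_snap;	/* case b) */
}
```

A caller would look up the old entry with trunk_find() on the previous trunk
payload and pass it, together with the freshly computed sha1, to
old_object_is_stale() before writing the new trunk entry.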