[sheepdog] [PATCH v5 13/14] doc: add documentation of object reclaim algorithm

Hitoshi Mitake mitake.hitoshi at lab.ntt.co.jp
Tue Mar 4 06:42:57 CET 2014


Cc: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
Cc: Valerio Pachera <sirio81 at gmail.com>
Cc: Alessandro Bolgia <alessandro at extensys.it>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
---

v5:
 - data_ref -> gref

 doc/object-reclaim.txt |   73 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)
 create mode 100644 doc/object-reclaim.txt

diff --git a/doc/object-reclaim.txt b/doc/object-reclaim.txt
new file mode 100644
index 0000000..4cef5f5
--- /dev/null
+++ b/doc/object-reclaim.txt
@@ -0,0 +1,73 @@
+This is a brief explanation of the object reclaim algorithm.  If you want to
+know the details or correctness proof of the generational reference counting,
+please refer the original paper [1]:
+==
+
+Each data object contains the following structure in its inode:
+
+    struct generation_reference {
+        int32_t generation;
+        int32_t count;
+    } gref;
+
+The generation field identifies the generation of the reference and
+the count field records the number of references copied from this
+particular reference.
+
+We also introduce a new type of object, a ledger object.  The ledger
+object exists for each data object and contains an array of int32_t
+values. The i-th element of the array contains information about the
+number of outstanding data object references in the i-th generation.
+Generational reference counting is performed as follows:
+
+1. When a vdi A is created, the initial data object references are
+   initialized as follows:
+
+       for (i = 0; i < MAX_DATA_OBJS; i++) {
+           A.gref[i].generation = 0;
+           A.gref[i].count = 0;
+       }
+
+   and a ledger object for each data object is initialized as follows:
+
+       ledger_object[0] = 1;
+       for (i = 1; i < MAX_DATA_OBJS; i++) {
+           ledger_object[i] = 0;
+       }
+
+   In practice, the ledger object is created and initialized when the
+   object is accessed for the first time, so we don't have to create
+   all the ledger objects when the vdi is created.
+
+2. When a VDI B is created as a clone (or snapshot) of A, the new
+   reference B is initialized as follows:
+
+       for (i = 0; i < MAX_DATA_OBJS; i++) {
+           B.gref[i].generation = A.gref[i].generation + 1;
+           B.gref[i].count = 0;
+       }
+
+   In addition, A.gref.count's are incremented:
+
+       for (i = 0; i < MAX_DATA_OBJS; i++) {
+           A.gref[i].count++;
+       }
+
+3. When a object o is removed, a decrement message is sent to its
+   ledger object containing the values of o.generation and o.count.
+
+4. When a decrement message to o is received, in which the generation
+   field has the value i and the count field has the value c, the
+   following actions are performed:
+
+   o.ledger[i]--;
+   o.ledger[i + 1] += c;
+
+   When every element of the ledger becomes zero (for all i >= 0,
+   o.ledger[i] = 0), o can be reclaimed.
+
+   Note that some elements of the ledger may hold negative values.
+
+
+[1] B. Goldberg, Generation reference counting: A reduced
+communication distributed storage reclamation scheme, PLDI '89
-- 
1.7.10.4




More information about the sheepdog mailing list