[sheepdog] [PATCH v6 00/16] A new implementation of cluster snapshot

Kai Zhang kyle at zelin.io
Tue May 21 12:11:02 CEST 2013


*Patch version info*
v6:
1. implement for_each_object_in_trunk() in trunk.c
2. change variable name prefix "obj_" to "object_" in the patch
3. remove HEX_LEN and NAME_LEN from sheep_priv.h because they are not used anymore.
4. adopt other suggestions on trivial issues.
5. move some modifications from patch 5 to patch 4 to make them clearer.

v5:
1. changed object_rb_tree.c to object_tree.c and made all function names in the file start with "object_tree_"
2. loading a snapshot now handles the index correctly
3. modified tests/030 to add a test of loading a snapshot by index
4. replaced sd_xprintf(...) with fprintf(...) in collie because sd_xprintf(...) is only supposed to be used in sheep.
5. merged patch 11/12 and patch 14/15.
6. reordered patches so that every commit compiles.
7. removed unnecessary checks of NULL before calling free().
8. don't inline get_object_directory() in farm.c
9. make farm.c concise by using LIST_HEAD and reducing the number of parameters.
10. use vdi_is_snapshot() to check whether a vdi is a snapshot.
11. check that the index or tag exists before formatting the cluster.
12. remove unnecessary modification of sheep/Makefile.am.

v4:
1. After loading a snapshot, create an active vdi for each vdi chain based on the last vdi snapshot.
2. Use uint8_t rather than bool for the flag set_bitmap in sd_req.
3. Remove duplicate implementations of sha1_hash() in sha1_file.
4. Do not use "${localpath}/.farm" as the default farm directory; just use the ${localpath} the user specified.
5. Try to create the directory if the path to save the snapshot does not exist.
6. Support using a tag or an index to select the snapshot to be loaded into a new cluster.
7. Rework the series substantially to produce cleaner patches.

v3: includes 10/12, 11/12 and 12/12, which were missing from the last email.

v2: includes the new implementation of cluster snapshot

*Patch description*
The current way of doing cluster snapshots is very powerful and performs well.
However, it also has some drawbacks:
1. After a new node joins the cluster, none of the earlier snapshots can be restored.
2. It is hard to back up a cluster snapshot to another storage system for disaster tolerance.
3. It is hard to initialize a new cluster by loading another cluster's snapshot.

The new idea is to move "farm" from sheep to collie and save cluster snapshots to a local path.

The new cluster snapshot retains all features of "farm", including:
1. object de-duplication
2. incremental store capability
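
To make the two features above concrete, here is a minimal sketch of a content-addressed store, which is the general technique behind this kind of de-duplication. It is not the farm code from this series: the names `struct store`, `store_put()` and `content_hash()` and the two size limits are hypothetical, objects are held in memory rather than in files, and a short FNV-1a hash stands in for the SHA-1 digest that farm uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STORE_MAX 64   /* hypothetical capacity of the toy store */
#define OBJ_MAX   256  /* hypothetical maximum object size */

struct stored_obj {
	uint64_t key;                  /* content hash, stand-in for a SHA-1 digest */
	size_t len;
	unsigned char data[OBJ_MAX];
};

struct store {
	struct stored_obj objs[STORE_MAX];
	size_t nr;                     /* number of distinct objects held */
};

/* FNV-1a 64-bit hash; chosen only for brevity, it is not SHA-1. */
static uint64_t content_hash(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint64_t h = 14695981039346656037ULL;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 1099511628211ULL;
	}
	return h;
}

/*
 * Save one object. Returns 1 if a new copy was written, 0 if identical
 * content was already stored, and -1 if the object does not fit.
 */
static int store_put(struct store *s, const void *buf, size_t len,
		     uint64_t *key)
{
	assert(s && buf && key);
	if (len > OBJ_MAX)
		return -1;

	*key = content_hash(buf, len);
	for (size_t i = 0; i < s->nr; i++) {
		const struct stored_obj *o = &s->objs[i];

		if (o->key == *key && o->len == len &&
		    memcmp(o->data, buf, len) == 0)
			return 0;      /* duplicate: nothing to write */
	}
	if (s->nr == STORE_MAX)
		return -1;

	s->objs[s->nr].key = *key;
	s->objs[s->nr].len = len;
	memcpy(s->objs[s->nr].data, buf, len);
	s->nr++;
	return 1;
}
```

Calling store_put() twice with the same bytes writes one copy and reports the second call as a duplicate. Saving a later snapshot through the same store therefore writes only the objects whose content changed, which is the incremental behaviour in its simplest form.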

In addition, it provides the ability to:
1. export a cluster snapshot to another storage device for backup and disaster tolerance
2. deploy a new cluster by restoring from a snapshot of another cluster

*Command usage*
save all read-only objects to a local path
the tag is used to describe the snapshot
$ collie cluster snapshot save tag /localpath

list all cluster snapshots saved in a local path
$ collie cluster snapshot list /localpath

load a snapshot into a cluster
this formats the cluster first
the user can select a snapshot by tag or by index
$ collie cluster snapshot load tag|idx /localpath
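
Since the load command takes either a tag or an index in the same argument position, the tool has to decide which one it was given. The sketch below shows one plausible rule, assuming that an argument made up only of decimal digits is an index and anything else is a tag. `struct snap_selector` and `parse_selector()` are hypothetical names, and the rule itself is an assumption for illustration rather than the logic in collie/cluster.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type, not taken from the patch series. */
struct snap_selector {
	int by_index;      /* 1: select by idx, 0: select by tag */
	uint32_t idx;      /* valid only when by_index is 1 */
	const char *tag;   /* valid only when by_index is 0 */
};

/*
 * Classify a command-line argument. Assumed rule: a string consisting
 * only of decimal digits that fits in 32 bits is an index; everything
 * else, including the empty string, is treated as a tag.
 */
static struct snap_selector parse_selector(const char *arg)
{
	struct snap_selector sel = { 0, 0, arg };
	char *end;
	unsigned long v;

	assert(arg != NULL);

	/* strtoul() skips whitespace and accepts a sign, so require a digit. */
	if (arg[0] < '0' || arg[0] > '9')
		return sel;

	errno = 0;
	v = strtoul(arg, &end, 10);
	if (*end != '\0' || errno != 0 || v > UINT32_MAX)
		return sel;            /* trailing text or overflow: a tag */

	sel.by_index = 1;
	sel.idx = (uint32_t)v;
	sel.tag = NULL;
	return sel;
}
```

Under this rule "3" selects the snapshot with index 3, while "before-upgrade" or "2013-05" is looked up as a tag. The trade-off is that a tag made up only of digits could not be selected by name.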

*TODO*
1. compression of snapshot data in sha1 file
2. only read snapshot objects created after the latest cluster snapshot was taken
3. reduce the size of sha1 file for better data de-duplication
4. support for saving snapshot to other storage systems, including s3, hdfs, etc.


Kai Zhang (16):
  sheep: change default store driver from "farm" to "plain"
  sheep: don't compile sheep/farm
  collie: remove snapshot from cluster subcommand
  sheep: remove farm logic from sheep
  sheep: store.c don't include farm.h
  sheep/farm: remove sheep/farm/farm.h
  script: remove script/simple2farm
  collie/farm: implement object_tree
  collie/farm: implement sha1_file
  collie/farm: implement snap object
  collie/farm: implement trunk object
  sheep: add a flag to let notify_vdi_add set bitmap if needed
  collie/farm: implement farm
  collie: fix collie failure when sub-subcommand has more than 2
    arguments
  collie: implement "collie cluster snapshot" subcommand
  test: add tests/030 for cluster snapshot

 collie/Makefile.am                 |   10 +-
 collie/cluster.c                   |  200 ++++++++++++-------
 collie/collie.h                    |    4 +
 collie/common.c                    |    2 +-
 {sheep => collie}/farm/farm.c      |  380 ++++++++++++++++++++----------------
 {sheep => collie}/farm/farm.h      |   58 ++++---
 collie/farm/object_tree.c          |  133 +++++++++++++
 {sheep => collie}/farm/sha1_file.c |   38 ++--
 {sheep => collie}/farm/snap.c      |   75 ++++----
 {sheep => collie}/farm/trunk.c     |  123 +++++--------
 collie/vdi.c                       |    2 +-
 include/sheepdog_proto.h           |   12 +-
 script/simple2farm                 |   51 -----
 sheep/Makefile.am                  |    4 +-
 sheep/ops.c                        |   65 +------
 sheep/sheep_priv.h                 |    3 -
 sheep/store.c                      |    7 +-
 sheep/vdi.c                        |    1 +
 tests/001.out                      |    2 +-
 tests/002.out                      |    2 +-
 tests/003.out                      |    2 +-
 tests/004.out                      |    2 +-
 tests/005.out                      |    2 +-
 tests/006.out                      |    2 +-
 tests/007.out                      |    4 +-
 tests/008.out                      |    2 +-
 tests/009.out                      |    2 +-
 tests/010.out                      |    2 +-
 tests/013.out                      |    2 +-
 tests/014.out                      |    2 +-
 tests/015.out                      |    2 +-
 tests/016.out                      |    2 +-
 tests/017.out                      |    2 +-
 tests/018.out                      |    2 +-
 tests/019.out                      |    2 +-
 tests/020.out                      |    2 +-
 tests/021.out                      |    2 +-
 tests/022.out                      |    2 +-
 tests/023.out                      |    2 +-
 tests/024.out                      |    2 +-
 tests/025.out                      |    2 +-
 tests/026.out                      |    2 +-
 tests/027.out                      |    2 +-
 tests/028.out                      |    2 +-
 tests/029.out                      |    2 +-
 tests/030                          |  144 +++++++++-----
 tests/030.out                      |   44 ++++-
 tests/031.out                      |    2 +-
 tests/032.out                      |    2 +-
 tests/033.out                      |    2 +-
 tests/034.out                      |    2 +-
 tests/035.out                      |    2 +-
 tests/036.out                      |    2 +-
 tests/037.out                      |    2 +-
 tests/038.out                      |    2 +-
 tests/039.out                      |    2 +-
 tests/040.out                      |    2 +-
 tests/041.out                      |    2 +-
 tests/042.out                      |    2 +-
 tests/043.out                      |    2 +-
 tests/044.out                      |    2 +-
 tests/045.out                      |    2 +-
 tests/046.out                      |    2 +-
 tests/047.out                      |    2 +-
 tests/048.out                      |    2 +-
 tests/049.out                      |    2 +-
 tests/050.out                      |    2 +-
 tests/051.out                      |    2 +-
 tests/052.out                      |    2 +-
 tests/053.out                      |    2 +-
 tests/054.out                      |    2 +-
 tests/055.out                      |    2 +-
 tests/056.out                      |    2 +-
 tests/057.out                      |    2 +-
 tests/058.out                      |    2 +-
 tests/059.out                      |    2 +-
 tests/060.out                      |    2 +-
 tests/061.out                      |    2 +-
 78 files changed, 825 insertions(+), 649 deletions(-)
 rename {sheep => collie}/farm/farm.c (23%)
 rename {sheep => collie}/farm/farm.h (53%)
 create mode 100644 collie/farm/object_tree.c
 rename {sheep => collie}/farm/sha1_file.c (87%)
 rename {sheep => collie}/farm/snap.c (62%)
 rename {sheep => collie}/farm/trunk.c (44%)
 delete mode 100755 script/simple2farm
 rewrite tests/030 (73%)
 rewrite tests/030.out (83%)



