[sheepdog] [PATCH 0/4] introduce slice to farm
Liu Yuan
namei.unix at gmail.com
Tue Jul 16 11:30:17 CEST 2013
A slice is a fixed-size chunk of an object to be stored in farm. We slice
each object into smaller chunks to get better deduplication.
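
To illustrate the idea, here is a minimal Python sketch of fixed-size slicing with content-addressed storage. It is not the C code from this series: the function names `write_sliced` and `read_sliced`, the in-memory dict used as the store, and the hex SHA-1 keys are all assumptions made for the example.

```python
import hashlib

# Assumed slice size for the sketch; the cover letter settles on 128K.
SLICE_SIZE = 128 * 1024


def write_sliced(store, data, slice_size=SLICE_SIZE):
    """Split data into fixed-size slices and store each one under its SHA-1.

    `store` is a plain dict standing in for the on-disk object store
    (hypothetical; the real farm layout is not shown here).
    Returns the ordered list of slice digests needed to rebuild `data`.
    """
    digests = []
    for off in range(0, len(data), slice_size):
        chunk = data[off:off + slice_size]
        key = hashlib.sha1(chunk).hexdigest()
        # An identical slice hashes to the same key, so it is stored once.
        store.setdefault(key, chunk)
        digests.append(key)
    return digests


def read_sliced(store, digests):
    """Rebuild the original data by concatenating the slices in order."""
    return b"".join(store[d] for d in digests)
```

With a 4M object made of identical bytes, this sketch stores one unique slice and records 32 digests, whereas hashing the object as a single unit can only deduplicate when the entire 4M matches.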
In a test on a 200M cluster with 2 copies (so roughly 100M of data to back up),
I got the following results:
                   size   time     compress ratio
w/ slice (64K)  :  51M    2.037s   49%
w/ slice (128K) :  53M    1.223s   47%
w/ slice (256K) :  57M    1.216s   43%
w/ slice (512K) :  61M    1.205s   39%
w/o slice (4M)  :  97M    1.174s   3%
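I chose a 128K slice size: it gives nearly the same ratio as 64K (47% vs. 49%)
while taking about as long as the larger sizes.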
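
The reported ratios appear consistent with the space saved relative to the roughly 100M of data to back up. A small sketch of that arithmetic, assuming this is how the ratio was derived (the function name `saved_ratio` and the 100M baseline as an exact figure are assumptions):

```python
# Stored sizes in MB, taken from the results reported in this mail.
STORED_MB = {"64K": 51, "128K": 53, "256K": 57, "512K": 61, "4M": 97}


def saved_ratio(stored_mb, original_mb=100):
    """Percentage of space saved, assuming ~100M of original data."""
    return round(100 * (original_mb - stored_mb) / original_mb)


ratios = {size: saved_ratio(mb) for size, mb in STORED_MB.items()}
```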
I also tried going one step further and compressing each slice before writing
it to disk. But because the images are virtually random files, zlib gave no
compression at all and the backup took much longer.
You can try the test zlib patch on top of this series. Please drop the zlib
patch when merging the patch set.
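
The zlib observation is easy to reproduce in isolation. The sketch below uses Python's `zlib` module with `os.urandom` output as a stand-in for the "virtually random" image data; it is not the test zlib patch, and the 128K buffer size is simply the slice size chosen above.

```python
import os
import zlib

SLICE_SIZE = 128 * 1024

# Random bytes stand in for a slice of a virtually random image file.
random_slice = os.urandom(SLICE_SIZE)
# An all-zero slice is shown for contrast: highly redundant data.
zero_slice = bytes(SLICE_SIZE)

compressed_random = zlib.compress(random_slice)
compressed_zero = zlib.compress(zero_slice)

print("random slice:", len(random_slice), "->", len(compressed_random))
print("zero slice:  ", len(zero_slice), "->", len(compressed_zero))
```

The random slice comes out at about its original size, while the zero slice shrinks to a tiny fraction, so compression only costs CPU time when the data has no redundancy inside a slice.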
Liu Yuan (4):
  sheep: don't trim before calculating sha1
  farm: clean up trunk.c
  farm: slice.c proper
  farm: use slice_{read, write} to read/write object

 collie/Makefile.am      |    2 +-
 collie/farm/farm.c      |    4 +-
 collie/farm/farm.h      |    4 +-
 collie/farm/sha1_file.c |   20 +++------
 collie/farm/slice.c     |  109 +++++++++++++++++++++++++++++++++++++++++++++++
 collie/farm/trunk.c     |    4 +-
 sheep/plain_store.c     |    5 ---
 7 files changed, 122 insertions(+), 26 deletions(-)
 create mode 100644 collie/farm/slice.c
--
1.7.9.5