[sheepdog] [PATCH 0/4] introduce slice to farm
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Thu Jul 18 09:22:54 CEST 2013
At Tue, 16 Jul 2013 17:30:17 +0800,
Liu Yuan wrote:
>
> A slice is a fixed-size chunk of an object to be stored in farm. We slice
> each object into smaller chunks to get better deduplication.
>
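The idea described above can be illustrated with a minimal Python sketch. It is not the code in collie/farm/slice.c. The names `slice_write` and `slice_read` are borrowed from the patch titles, but their signatures here are hypothetical, and `store` is an assumed in-memory dict standing in for the farm object directory.

```python
import hashlib

SLICE_SIZE = 128 * 1024  # 128K, the slice size chosen in this series


def slice_write(store, data, slice_size=SLICE_SIZE):
    """Split data into fixed-size slices and store each under its SHA-1.

    Returns the list of slice digests needed to rebuild the object.
    `store` is a hypothetical dict (digest -> bytes), not farm's real layout.
    """
    digests = []
    for off in range(0, len(data), slice_size):
        chunk = data[off:off + slice_size]
        key = hashlib.sha1(chunk).hexdigest()
        # Identical slices hash to the same key, so they are stored only once.
        store.setdefault(key, chunk)
        digests.append(key)
    return digests


def slice_read(store, digests):
    """Rebuild an object from its list of slice digests."""
    return b"".join(store[key] for key in digests)
```

With a 4M object that is mostly zeros, the 32 slices collapse to very few stored chunks, whereas hashing the whole 4M object would only deduplicate against a byte-identical object.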
> In a test on a 200M cluster with 2 copies (so roughly 100M of data to back up),
> I got the following results:
>
>                    size    time     compress ratio
> w/ slice (64K)  :  51M     2.037s   49%
> w/ slice (128K) :  53M     1.223s   47%
> w/ slice (256K) :  57M     1.216s   43%
> w/ slice (512K) :  61M     1.205s   39%
> w/o slice (4M)  :  97M     1.174s   3%
>
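The "compress ratio" column in the table above is not defined in the message. Assuming it means the space saved relative to the roughly 100M of data to back up, the following short Python check reproduces every figure. The function name and the 100M baseline are assumptions made for this illustration.

```python
BASELINE_MB = 100  # "roughly 100M" of data to back up, taken from the message


def space_saved_percent(backup_mb, baseline_mb=BASELINE_MB):
    """Percentage of the baseline that the backup did not need to store."""
    return round(100 * (1 - backup_mb / baseline_mb))


# Backup sizes in MB as reported in the table.
sizes_mb = {"64K": 51, "128K": 53, "256K": 57, "512K": 61, "4M (no slice)": 97}
ratios = {name: space_saved_percent(mb) for name, mb in sizes_mb.items()}
```

Under that assumption, going from 64K to 128K slices costs about 2 percentage points of saved space and cuts the run time from 2.037s to 1.223s.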
> I chose a 128K slice size.
>
> I also went one step further and tried compressing each slice before writing
> it to disk. But because the images are virtually random files, zlib gave no
> compression at all and only made the backup take much longer.
>
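The effect described above is easy to reproduce with Python's zlib module. This is only an illustration on synthetic data, not the test zlib patch: `os.urandom` is assumed to be a fair stand-in for the high-entropy image data, and the all-zero slice is included purely for contrast.

```python
import os
import zlib

SLICE_SIZE = 128 * 1024  # same slice size as chosen in the series


def compressed_size(data):
    """Return the number of bytes zlib produces for data at the default level."""
    return len(zlib.compress(data))


random_slice = os.urandom(SLICE_SIZE)  # high-entropy data, assumed similar to the test images
zero_slice = bytes(SLICE_SIZE)         # highly redundant data, for contrast

random_size = compressed_size(random_slice)
zero_size = compressed_size(zero_slice)
```

On random input the output is about as large as the input, so the CPU time spent compressing buys nothing, while the all-zero slice shrinks to a tiny fraction of its size.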
> You can try the test zlib patch on top of this series. Please drop the zlib
> patch when merging the patch set.
>
> Liu Yuan (4):
> sheep: don't trim before calculating sha1
> farm: clean up trunk.c
> farm: slice.c proper
> farm: use slice_{read, write} to read/write object
>
> collie/Makefile.am | 2 +-
> collie/farm/farm.c | 4 +-
> collie/farm/farm.h | 4 +-
> collie/farm/sha1_file.c | 20 +++------
> collie/farm/slice.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++
> collie/farm/trunk.c | 4 +-
> sheep/plain_store.c | 5 ---
> 7 files changed, 122 insertions(+), 26 deletions(-)
> create mode 100644 collie/farm/slice.c
Applied except the first one, thanks!
Thanks,
Kazutaka