[sheepdog] [PATCH 0/4] introduce slice to farm

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Thu Jul 18 09:22:54 CEST 2013


At Tue, 16 Jul 2013 17:30:17 +0800,
Liu Yuan wrote:
> 
> Slice is a fixed chunk of one object to be stored in farm. We slice
> the object into smaller chunks to get better deduplication.
> 
> In a test on a 200M cluster with 2 copies (so roughly 100M of data to back up),
> I got the following results:
> 
>                    size  time    compress ratio
> w/ slice (64K)  :  51M   2.037s       49%
> w/ slice (128K) :  53M   1.223s       47%
> w/ slice (256K) :  57M   1.216s       43%
> w/ slice (512K) :  61M   1.205s       39%
> w/o slice (4M)  :  97M   1.174s       3%
> 
> I chose a 128K slice size.
> 
> I also went one step further and tried compressing each slice before writing
> it to disk. But because the images are virtually random files, zlib gave no
> compression at all, and the backup took much longer.
> 
> You can try the test zlib patch on top of this series. Please drop the zlib
> patch when merging the patch set.
> 
> Liu Yuan (4):
>   sheep: don't trim before calculating sha1
>   farm: clean up trunk.c
>   farm: slice.c proper
>   farm: use slice_{read, write} to read/write object
> 
>  collie/Makefile.am      |    2 +-
>  collie/farm/farm.c      |    4 +-
>  collie/farm/farm.h      |    4 +-
>  collie/farm/sha1_file.c |   20 +++------
>  collie/farm/slice.c     |  109 +++++++++++++++++++++++++++++++++++++++++++++++
>  collie/farm/trunk.c     |    4 +-
>  sheep/plain_store.c     |    5 ---
>  7 files changed, 122 insertions(+), 26 deletions(-)
>  create mode 100644 collie/farm/slice.c
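
For readers of the archive, the idea described in the cover letter above (cut
each object into fixed-size slices and store every distinct slice only once,
keyed by its SHA-1) can be sketched roughly as follows. This is only an
illustration in Python, not the code in collie/farm/slice.c: store_object,
load_object, stored_bytes, block and the in-memory dict used as the store are
hypothetical names, and the two sample objects are made up.

```python
import hashlib

SLICE_SIZE = 128 * 1024         # slice size chosen in the cover letter
OBJECT_SIZE = 4 * 1024 * 1024   # object size from the "w/o slice (4M)" row


def store_object(data, store, slice_size=SLICE_SIZE):
    """Cut data into fixed-size slices, keep each distinct slice once,
    and return the SHA-1 list needed to rebuild the object."""
    digests = []
    for off in range(0, len(data), slice_size):
        chunk = data[off:off + slice_size]
        key = hashlib.sha1(chunk).hexdigest()
        store.setdefault(key, chunk)   # an already-known slice costs nothing
        digests.append(key)
    return digests


def load_object(digests, store):
    """Rebuild an object from its list of slice digests."""
    return b"".join(store[d] for d in digests)


def stored_bytes(store):
    """Total payload bytes held in the store."""
    return sum(len(chunk) for chunk in store.values())


def block(i):
    """Hypothetical test data: one slice filled with the byte value i."""
    return bytes([i]) * SLICE_SIZE


# Two made-up 4M objects that differ only in their first 128K slice.
shared_tail = b"".join(block(i) for i in range(1, 32))
obj_a = block(100) + shared_tail
obj_b = block(200) + shared_tail

sliced, whole = {}, {}
recipe_a = store_object(obj_a, sliced)
recipe_b = store_object(obj_b, sliced)
# Treating the whole object as a single slice models the "w/o slice" case.
store_object(obj_a, whole, slice_size=OBJECT_SIZE)
store_object(obj_b, whole, slice_size=OBJECT_SIZE)

# Fraction of the raw data that did not have to be stored.
saving = 1 - stored_bytes(sliced) / (len(obj_a) + len(obj_b))
```

With whole-object hashing the two objects share nothing, because a change in
one slice changes the hash of the entire 4M object. With 128K slices the 31
common slices are stored once. The "compress ratio" column in the table above
appears to be the same kind of figure, 1 - size / 100M.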

Applied except the first one, thanks!

Thanks,

Kazutaka


