[Sheepdog] failed node and space reclaiming
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Sun Jul 24 09:54:31 CEST 2011
At Sun, 24 Jul 2011 14:33:29 +1200,
Michael wrote:
> Hi All,
>
> Testing sheepdog.
>
> created a cluster of 3 nodes with --copies=2
> created a 2 GB VDI, started a VM, and wrote data to the VDI:
> dd if=/dev/zero of=/dev/vda bs=1M count=2000
>
> collie node info
> Id     Size    Used    Use%
>  0     20 GB   740 MB    3%
>  1     20 GB   636 MB    3%
>  2     4.8 GB  632 MB   12%
> Total  44 GB   2.0 GB    4%, total virtual VDI Size 2.0 GB
>
> so far so good.
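For reference, the setup above presumably corresponds to commands along
these lines (the VDI name 'test' is made up):

  $ collie cluster format --copies=2
  $ qemu-img create sheepdog:test 2G
  $ qemu-system-x86_64 ... -drive file=sheepdog:test,if=virtio
  (then, inside the guest)
  # dd if=/dev/zero of=/dev/vda bs=1M count=2000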
You specified '--copies=2' in the format option, so the total used
data size should be 2 x 2000 MB (= 3.9 GB), shouldn't it? Probably
the VM had not yet synced all the data at this point.
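To spell out the expected numbers: 2 copies x 2000 MB = 4000 MB, or
about 3.9 GB in the units collie reports, whereas the output above
shows only 2.0 GB used.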
>
> now simulate a node failure (on node 3, by killing sheep)
> after recovery completed:
> collie node info
> Id     Size    Used    Use%
>  0     20 GB   2.0 GB   10%
>  1     19 GB   1.9 GB    9%
> Total  39 GB   3.8 GB    9%, total virtual VDI Size 2.0 GB
>
> start sheep on node 3 again and wait for recovery; after it finished:
> collie node info
> Id     Size    Used    Use%
>  0     19 GB   1.4 GB    7%
>  1     19 GB   1.2 GB    6%
>  2     3.5 GB  1.2 GB   34%
> Total  41 GB   3.9 GB    9%, total virtual VDI Size 2.0 GB
>
> cause one more failure on the same node and bring it back again:
> collie node info
> Id     Size    Used    Use%
>  0     18 GB   1.4 GB    7%
>  1     18 GB   1.2 GB    6%
>  2     2.3 GB  1.2 GB   53%
> Total  39 GB   3.9 GB   10%, total virtual VDI Size 2.0 GB
>
> now node 3 is using 3x more space than it should, and nodes 1
> and 2 are using 2x.
I think this shows the correct used size.
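To spell out the arithmetic: the Used column sums to 1.4 GB + 1.2 GB
+ 1.2 GB = 3.8 GB, which is just the expected 2 copies x 2.0 GB of
VDI data. Node 3's Used stays at 1.2 GB across both recoveries; its
high Use% comes from its shrinking reported Size, not from holding
extra copies.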
>
> Also it seems that after a failure it tries to copy all the data to the
> other nodes. It could be a good idea to copy only the changed data (like drbd).
Yes, it would be nice to support differential copies for faster
object recovery. I've added it to our TODO list:
https://github.com/collie/sheepdog/issues/24
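For illustration only, here is a rough shell sketch of the idea in the
spirit of rsync; the store path and peer name are made up, and this is
not how sheep's recovery code actually works:

  #!/bin/sh
  # Fetch only the objects whose content differs from the peer's copy.
  STORE=/var/lib/sheepdog/obj   # hypothetical object directory
  PEER=node0                    # hypothetical healthy peer
  for obj in $(ssh "$PEER" "ls $STORE"); do
      rsum=$(ssh "$PEER" "md5sum $STORE/$obj" | cut -d' ' -f1)
      lsum=$(md5sum "$STORE/$obj" 2>/dev/null | cut -d' ' -f1)
      # copy the object only if it is missing locally or has changed
      if [ "$rsum" != "$lsum" ]; then
          scp "$PEER:$STORE/$obj" "$STORE/$obj"
      fi
  done

A real implementation would of course batch the comparison (e.g.
exchange a list of object checksums or version numbers in one round
trip) rather than doing one connection per object.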
Thanks,
Kazutaka