At Sun, 24 Jul 2011 14:33:29 +1200,
Michael wrote:
> Hi All,
>
> Testing sheepdog.
>
> Created a cluster of 3 nodes with --copies=2,
> created a vdi of 2 GB, started a VM and wrote data to the vdi:
> dd if=/dev/zero of=/dev/vda bs=1M count=2000
>
> collie node info
> Id      Size     Used     Use%
>  0      20 GB    740 MB     3%
>  1      20 GB    636 MB     3%
>  2      4.8 GB   632 MB    12%
> Total   44 GB    2.0 GB     4%, total virtual VDI Size 2.0 GB
>
> so far so good.

You passed '--copies=2' to the format command, so the total used size
should be 2 x 2000 MB (= 3.9 GB), shouldn't it? Perhaps the VM had not
yet synced all the data at this point?

> now simulate node failure (on node 3, by killing sheep)
> after recovery completed:
> collie node info
> Id      Size     Used     Use%
>  0      20 GB    2.0 GB    10%
>  1      19 GB    1.9 GB     9%
> Total   39 GB    3.8 GB     9%, total virtual VDI Size 2.0 GB
>
> start sheep on node 3 again - wait for recovery, after it finished:
> collie node info
> Id      Size     Used     Use%
>  0      19 GB    1.4 GB     7%
>  1      19 GB    1.2 GB     6%
>  2      3.5 GB   1.2 GB    34%
> Total   41 GB    3.9 GB     9%, total virtual VDI Size 2.0 GB
>
> one more failure on the same node and bring it back again:
> collie node info
> Id      Size     Used     Use%
>  0      18 GB    1.4 GB     7%
>  1      18 GB    1.2 GB     6%
>  2      2.3 GB   1.2 GB    53%
>
> Total   39 GB    3.9 GB    10%, total virtual VDI Size 2.0 GB
>
> now node number 3 is using 3x more space than it should, and nodes 1
> and 2 are using 2x.

I think this shows the correct used size.

> Also it seems that after a failure it tries to copy all data over to
> the other node. It could be a good idea to copy only the changed data
> (like drbd).

Yes. It would be nice to support differential copy for fast object
recovery. I've added it to our TODO list:
  https://github.com/collie/sheepdog/issues/24

Thanks,

Kazutaka
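
The expected total mentioned in the reply (2 x 2000 MB = 3.9 GB) can be checked with a few lines of Python. This is only a sketch under two assumptions that the thread does not confirm: that collie reports sizes in binary units (1 GB = 1024 MB), and that every written object is stored exactly `copies` times. The helper name `expected_used_bytes` is made up for illustration and is not part of sheepdog.

```python
MIB = 1024 ** 2  # assumed: "MB" in the output means 2**20 bytes
GIB = 1024 ** 3  # assumed: "GB" in the output means 2**30 bytes


def expected_used_bytes(written_bytes, copies):
    """Total space used across the cluster if each object is stored `copies` times."""
    return written_bytes * copies


# dd if=/dev/zero of=/dev/vda bs=1M count=2000 writes 2000 MiB
written = 2000 * MIB
total = expected_used_bytes(written, copies=2)

print(f"expected total used: {total / GIB:.1f} GB")  # prints "expected total used: 3.9 GB"
```

Under these assumptions, the 3.8 to 3.9 GB totals reported after each recovery match the expected value, while the initial 2.0 GB reading is about half of it.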