At Thu, 17 May 2012 15:48:07 +0800,
Liu Yuan wrote:
>
> On 05/17/2012 03:40 PM, MORITA Kazutaka wrote:
>
> > I thought that one advantage of the simple_store driver was that it
> > uses a syscall link() to copy objects from local older epochs to the
> > current epoch, so we could avoid many I/Os in the recovery process.
> > However, it seems that the link operation of the farm driver is not
> > called at all on my environment. Does Farm do the recovery process in
> > the different way from the simple_store driver?
> >
> farm_link() will be called for multiple nodes events and in a very
> unusual corner cases. Actually, for the case you describe Farm works in
> a more optimal way: there isn't any operations for the object that isn't
> to be migrated to other nodes, save a system call of link() than simple
> store.

If that is true, I would like to see the optimization implemented in
the recovery core code instead of in the farm driver.

But does the optimization work correctly? I couldn't find the code
that avoids the redundant link calls, and the farm driver actually
failed to recover objects correctly with the following testcase:

[Testcase script]
==
#!/bin/bash

set -ex

STORE=$1

# start three sheep daemons
for i in 0 1 2; do
    ./sheep/sheep /store/$i -z $i -p 700$i -W
done

sleep 1
./collie/collie cluster format -c 2 -b $STORE

# create a pre-allocated vdi
./collie/collie vdi create test 80M -P

# stop the 3rd sheep
pkill -f "sheep /store/2"

# write data to the vdi
cat /dev/urandom | ./collie/collie vdi write test

# restart the 3rd sheep
./sheep/sheep /store/2 -z 2 -p 7002 -W

# wait for object recovery to finish
sleep 10

# show md5sum of the vdi on each node
for i in 0 1 2; do
    ./collie/collie vdi read test -p 700$i | md5sum
done
==

[Results]
$ ./testcase.sh simple
... (snip) ...
+ for i in 0 1 2
+ ./collie/collie vdi read test -p 7000
+ md5sum
6ebd372401d0848734293709bb7b3cb7  -
+ for i in 0 1 2
+ ./collie/collie vdi read test -p 7001
+ md5sum
6ebd372401d0848734293709bb7b3cb7  -
+ for i in 0 1 2
+ ./collie/collie vdi read test -p 7002
+ md5sum
6ebd372401d0848734293709bb7b3cb7  -

$ ./testcase.sh farm
... (snip) ...
+ for i in 0 1 2
+ ./collie/collie vdi read test -p 7000
+ md5sum
ef8bd9bbc1f140979405ac08abd24541  -
+ for i in 0 1 2
+ ./collie/collie vdi read test -p 7001
+ md5sum
dee273206981c7f821061310eac90cd3  -
+ for i in 0 1 2
+ ./collie/collie vdi read test -p 7002
+ md5sum
ca74a3b2e031a20b03c3baa4af9ab9c5  -

>
> This contributes to Farm to outperform simple store for recovery,
> because most objects are not to be migrated at all for a recovery.

I'm fine with dropping the simple driver if the above kinds of
problems are planned to be fixed in the farm driver. I would like
correctness to be regarded as more important than performance.

Thanks,

Kazutaka
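
P.S. The checksum comparison at the end of the testcase could be
automated so that the script fails by itself when the nodes disagree.
A minimal sketch, assuming a helper named check_same (a hypothetical
name, not part of the sheepdog tree) that reads md5sum output lines on
stdin and returns non-zero when the hashes differ:

```shell
#!/bin/bash

# check_same is a hypothetical helper, not part of sheepdog.
# It reads md5sum-style lines ("<hash>  -") on stdin and returns 0
# only if at least one hash was read and all hashes are identical.
check_same() {
    local first="" sum rest
    while read -r sum rest; do
        # skip blank lines
        [ -n "$sum" ] || continue
        if [ -z "$first" ]; then
            # remember the first hash as the reference value
            first=$sum
        elif [ "$sum" != "$first" ]; then
            echo "mismatch: $first != $sum" >&2
            return 1
        fi
    done
    # fail when no checksum was read at all
    [ -n "$first" ]
}

# Intended use in the testcase (kept as a comment because it needs a
# running cluster):
#
#   for i in 0 1 2; do
#       ./collie/collie vdi read test -p 700$i | md5sum
#   done | check_same
```

With "set -e" in effect, a mismatch would make the testcase exit with
a non-zero status instead of requiring the hashes to be compared by eye.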