[sheepdog] [PATCH 2/3] sheep: do the real work of delay recovery

Yunkai Zhang yunkai.me at gmail.com
Mon Jul 30 06:49:29 CEST 2012


On Mon, Jul 30, 2012 at 12:39 PM, MORITA Kazutaka
<morita.kazutaka at lab.ntt.co.jp> wrote:
> At Sun, 29 Jul 2012 22:29:21 +0800,
> Yunkai Zhang wrote:
>>
>> From: Yunkai Zhang <qiushu.zyk at taobao.com>
>>
>> After delay recovery starts, all recovery operations in sd_join_handler or
>> sd_leave_handler are paused. The old vnode information is kept in a new
>> static variable named old_vnode_info in group.c, which is used by the
>> following recovery operation. A delay_recovery variable is added to
>> join_message so that a joining sheep can share the cluster's delay_recovery
>> status.
>>
>> During a delay recovery transaction, joined and left nodes are stored in
>> a global array so that we can show the internal status to the user when
>> necessary (the next patch will use it).
>>
>> Only one recovery operation is executed when the user sends the
>> "collie delay_recovery stop" command. A flag, do_delay_recovery, is added to
>> indicate whether there is recovery work to be done. A new function,
>> get_old_vnode_info(), is created so that other code can access the
>> old_vnode_info variable.
>>
>> Signed-off-by: Yunkai Zhang <qiushu.zyk at taobao.com>
>> ---
>>  include/internal_proto.h |  6 ++++
>>  sheep/group.c            | 72 +++++++++++++++++++++++++++++++++++-------------
>>  sheep/ops.c              | 16 +++++++++++
>>  sheep/sheep_priv.h       |  4 +++
>>  4 files changed, 79 insertions(+), 19 deletions(-)
>
> This patch doesn't pass the following test:
>
> ==
> #!/bin/bash
>
> set -ex
>
> sheep /store/0 -z 0 -p 7000
> sheep /store/1 -z 1 -p 7001
> collie cluster format -c 2
> collie delay_recovery start
>
> qemu-img create sheepdog:test 4G
>
> # create 20 objects
> for i in `seq 0 19`; do
>     collie vdi write test $((i * 4 * 1024 * 1024)) 512 < /dev/zero
> done
>
> sheep /store/2 -z 2 -p 7002
>
> # overwrite the objects
> for i in `seq 0 19`; do
>     collie vdi write test $((i * 4 * 1024 * 1024)) 512 < /dev/zero
> done
> ==
>
> IIUC, if sheep receives write requests, it needs to recover the
> objects even if object recovery is delayed.


Thanks for your script. I'll send v2 after fixing this.
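
To check the failing case, the expected behaviour can be modelled in a few
lines of C. This is only a sketch under assumptions: struct model and the
model_* functions are hypothetical names made up for illustration and are not
part of the patch or of sheep. The one rule taken from the review comment is
that a write to an object that has not been recovered yet must recover that
object first, even while recovery is delayed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_OBJECTS 20 /* same count as the objects created by the test script */

/* Hypothetical model state, not the real sheep data structures. */
struct model {
	bool delay_recovery;           /* set by "collie delay_recovery start" */
	bool stale[NR_OBJECTS];        /* object still waits for recovery */
	unsigned int recovered_on_demand;
};

/* Recover a single object (here: just clear its stale mark). */
static void model_recover_one(struct model *m, size_t idx)
{
	assert(idx < NR_OBJECTS);
	m->stale[idx] = false;
}

/*
 * A node joined. Simplification: every object is treated as affected by
 * the membership change. Without delayed recovery everything is recovered
 * at once; with delayed recovery the objects only stay marked as stale.
 */
static void model_node_joined(struct model *m)
{
	for (size_t i = 0; i < NR_OBJECTS; i++)
		m->stale[i] = true;

	if (!m->delay_recovery)
		for (size_t i = 0; i < NR_OBJECTS; i++)
			model_recover_one(m, i);
}

/*
 * A write request arrived. If the target object is stale it is recovered
 * first, regardless of delay_recovery, and only then is the write accepted.
 */
static int model_write(struct model *m, size_t idx)
{
	assert(idx < NR_OBJECTS);
	if (m->stale[idx]) {
		model_recover_one(m, idx);
		m->recovered_on_demand++;
	}
	return 0; /* write accepted */
}
```

In this model, overwriting all 20 objects after a third node joins recovers
each of them on demand, while untouched objects stay pending until
"collie delay_recovery stop".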

>
> Thanks,
>
> Kazutaka



-- 
Yunkai Zhang
Work at Taobao


