[sheepdog] [PATCH V4 2/2] collie: optimize 'collie vdi check' command

Yunkai Zhang yunkai.me at gmail.com
Mon Sep 3 16:14:56 CEST 2012


On Mon, Sep 3, 2012 at 9:29 PM, MORITA Kazutaka
<morita.kazutaka at lab.ntt.co.jp> wrote:
> At Mon, 3 Sep 2012 21:02:22 +0800,
> Yunkai Zhang wrote:
>>
>> On Mon, Sep 3, 2012 at 8:40 PM, MORITA Kazutaka
>> <morita.kazutaka at lab.ntt.co.jp> wrote:
>> > At Mon, 3 Sep 2012 20:11:48 +0800,
>> > Yunkai Zhang wrote:
>> >>
>> >> On Mon, Sep 3, 2012 at 7:46 PM, Liu Yuan <namei.unix at gmail.com> wrote:
>> >> > On 09/03/2012 07:12 PM, Yunkai Zhang wrote:
>> >> >> On Mon, Sep 3, 2012 at 7:03 PM, Liu Yuan <namei.unix at gmail.com> wrote:
>> >> >>> On 09/03/2012 06:54 PM, Yunkai Zhang wrote:
>> >> >>>> We can compare this result with comand line tools: sha1sum, which can
>> >> >>>> also help us to verify program's correction I think.
>> >> >>>
>> >> >>> Could you specify a use case for this? I have no idea how end users can
>> >> >>> make use of sha1 string. For a crashed cluster, I think want users need
>> >> >>> to do is, run check to restore cluster consistency by one command like fsck.
>> >> >>
>> >> >> Our root problem is just why we need to move sha1_to_hex from
>> >> >> farm/sha1_file.c to lib/sha1.c, does this change bother you?
>> >> >>
>> >> >
>> >> > Of course, it is ugly. Sha1 is only used in Farm, moving it out of Farm implementation
>> >> > is nonsense to me.
>> >> >
>> >> >> Show sha1 string will make the output more meaningful for us, why need
>> >> >> to a use case? It's a so small/simple thing, do you like argue
>> >> >> everything?
>> >> >>
>> >> >
>> >> > 1) it is my duty to review patches and whoever tries to move sha1_to_hex() out of
>> >> > sha1_file.c, I will ask why he needs it and verify if it is a reasonable movement.
>> >> > 2) it is you who always take patch comments as malicious argument which is meant to
>> >> > be against you but in fact only meaningful to patch itself.
>> >>
>> >> You misunderstand me, I didn't take any comments as malicious argument
>> >> for me, what I think is that you spend so much time to discuss those
>> >> triviality.
>> >>
>> >> Don't you think 'Move sha1.c from farm/sha1_file.c to lib/sha1.c' is
>> >> worth discussing? Time is very precious for us, does it?
>> >
>> > If we need the function outside farm, it is okay to move it to
>> > lib/sha1.c.  I think what Yuan want to know is that how it will help
>> > users to print sha1 values in the collie outputs.  What is the
>> > situation users want to compare hash strings?
>>
>> What the original purpose of this patch is to let user see the
>> different replicas of inconsistent object, instead of repairing it
>> directly.
>>
>> This patch can help us to test cluster's consistency:
>> 1. We need not to write shell script to read raw objects and compare
>> it from each node, especially in a big cluster, using shell to compare
>> is not efficient.
>
> So why do you need to compare each hash strings?  What users want to
> know is not hash strings, but only whether there is object
> inconsistency or not, no?
>
>>
>> 2. This patch can help me to test code when develop program, it can
>> use to check 'delay recovery' which may lead to inconsistency when
>> implement incorrectly. Printing sha1 in output, and comparing this
>> value with sha1sum's output, It can help us to verity program's
>> correction (I have mentioned it in previous email).
>
> The design of delay recovery we agreed before is:
>  - Basically, a recovery process doesn't recover objects even if epoch
>    is incremented
>  - But when clients (qemu, collie, other sheeps) access objects, the
>    recovery process recover the objects.  This is necessary for
>    clients to read the latest objects.
>
> So 'collie vdi check' must read valid objects.  I think we cannot use
> the command for debugging delay recovery.  IMHO, running md5sum

If 'collie vdi check' must read valid objects, what does it check?

Using this patch to debugging is only my selection, I like to use this
command, it can release me to write a script, you can continue to use
md5sum, don't you?

If user do not want to see the sha1's output, they can execute 'collie
vdi check xxx > /dev/null'.

> directly against stored objects on filesystem looks much simpler and
> useful for debugging.



>
> Thanks,
>
> Kazutaka



-- 
Yunkai Zhang
Work at Taobao



More information about the sheepdog mailing list