[sheepdog] [Sheepdog] [PATCH 0/4] fix a race when multiple sheep join a cluster very quickly

Yunkai Zhang yunkai.me at gmail.com
Thu May 17 11:03:13 CEST 2012


On Thu, May 17, 2012 at 4:49 PM, Christoph Hellwig <hch at infradead.org> wrote:
> On Thu, May 17, 2012 at 04:45:53PM +0800, Yunkai Zhang wrote:
>> Another potential dead lock bug in __sd_join() that sheep may fetch
>> vdi_bitmap from itself was found and fixed.
>
> get_vdi_bitmap_from already contains an is_myself check, so this
> shouldn't happen.  OTOH, moving that check into the caller will
> help with the case where we only get the vdi bitmap from a single node.

You are right. My patch looks like:

for (i = 0; i < w->member_list_entries; i++) {
        /* We should not fetch vdi_bitmap from myself */
        if (node_eq(w->member_list + i, &sys->this_node))
                continue;

        get_vdi_bitmap_from(w->member_list + i);

        /*
         * If a newcomer is trying to join a running cluster, it only
         * needs to read one copy of the bitmap, from the first other
         * member.
         */
        if (sys_stat_wait_format())
                break;
}

Because calling get_vdi_bitmap_from(*this_node*) fetches nothing, the
node_eq check ensures that a sheep has fetched at least one copy of the
bitmap from *another* member by the time sys_stat_wait_format() returns
true and the loop breaks.
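
To make the control flow easier to check in isolation, here is a minimal
standalone sketch of the same loop. It is not the sheep code: struct node,
its fields, fetch_bitmap_from() and the newcomer flag are hypothetical
stand-ins for the real node type, get_vdi_bitmap_from() and
sys_stat_wait_format(), and the "fetch" only counts calls.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the real sheepdog node descriptor. */
struct node {
	char addr[16];
	int port;
};

/* Two nodes are equal when both address and port match (assumed rule). */
static bool node_eq(const struct node *a, const struct node *b)
{
	return a->port == b->port && strcmp(a->addr, b->addr) == 0;
}

/* Stand-in for get_vdi_bitmap_from(): only counts remote fetches. */
static int fetch_calls;

static void fetch_bitmap_from(const struct node *n)
{
	(void)n;
	fetch_calls++;
}

/*
 * Walk the member list, skip ourselves, and fetch from every other
 * member.  When 'newcomer' is true (standing in for
 * sys_stat_wait_format()), stop after the first successful fetch.
 * Returns the number of members we fetched from.
 */
static size_t fetch_bitmaps(const struct node *members, size_t nr,
			    const struct node *self, bool newcomer)
{
	size_t fetched = 0;

	for (size_t i = 0; i < nr; i++) {
		/* Never fetch the bitmap from ourselves. */
		if (node_eq(&members[i], self))
			continue;

		fetch_bitmap_from(&members[i]);
		fetched++;

		/* The break is only reached after a fetch from another member. */
		if (newcomer)
			break;
	}

	return fetched;
}
```

Because the self check comes before the break, a newcomer that happens to
be first in the member list still moves on to the next member instead of
stopping with nothing fetched.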



-- 
Yunkai Zhang
Work at Taobao


