[Sheepdog] [PATCH] sheep: get vdi bitmap correctly in join phase

Tue Sep 13 05:12:32 CEST 2011

On 09/13/2011 08:05 AM, MORITA Kazutaka wrote:
> At Sun, 11 Sep 2011 00:49:20 +0800,
> Liu Yuan wrote:
>> From: Liu Yuan<tailai.ly at taobao.com>
>>
>> After a new node joins, the command *collie-vdi-list*
>> cannot list any information about vdis that were created
>> before the node joined. This is because in the
>> join phase, get_vdi_bitmap_from_all() cannot get the correct
>> vdi bitmap *if* there is no node in sd_node_list.
>> This patch moves this function after the function
>> which updates the sd_node_list.
>>
>> With this patch, collie-vdi-list works as expected in
>> newly added nodes.
>>
>> Signed-off-by: Liu Yuan<tailai.ly at taobao.com>
>> ---
>>   sheep/group.c |   13 ++-----------
>>   1 files changed, 2 insertions(+), 11 deletions(-)
>>
>> diff --git a/sheep/group.c b/sheep/group.c
>> index eb0c4e2..22c4f66 100644
>> --- a/sheep/group.c
>> +++ b/sheep/group.c
>> @@ -863,17 +863,6 @@ static void __sd_deliver(struct cpg_event *cevent)
>>   			break;
>>   		}
>>   	}
>> -
>> -	if (m->state == DM_FIN) {
>> -		switch (m->op) {
>> -		case SD_MSG_JOIN:
>> -			if (((struct join_message *)m)->cluster_status == SD_STATUS_OK)
>> -				if (sys->status != SD_STATUS_OK)
>> -					get_vdi_bitmap_from_all();
>> -			break;
>> -		}
>> -	}
>> -
>>   }
>>
>>   static void send_join_response(struct work_deliver *w)
>> @@ -902,6 +891,8 @@ static void __sd_deliver_done(struct cpg_event *cevent)
>>   		switch (m->op) {
>>   		case SD_MSG_JOIN:
>>   			update_cluster_info((struct join_message *)m);
>> +			if (((struct join_message *)m)->cluster_status == SD_STATUS_OK)
>> +					get_vdi_bitmap_from_all();
>>   			break;
>>   		case SD_MSG_LEAVE:
>>   			node = find_node(&sys->sd_node_list, m->nodeid, m->pid);
Hi,
     Thanks for your review and explanation.
> Thanks for your contribution, but this doesn't work.
>
> We cannot sleep in __sd_deliver_done (the main thread), so we cannot
> call get_vdi_bitmap_from_all() in it; if all the sheep daemons call
> get_vdi_bitmap_from_all() at the same time, the cluster will get stuck.
>
> I think the right way is to do the following in __sd_deliver (the
> worker thread):
>
>   - call get_vdi_bitmap_from_all()
>   - extract a new node from join_message, and get a vdi bitmap from the node
>

     I don't quite get it. Did you mean that we first call
get_vdi_bitmap_from_all() and then check sys->vdi_inuse (or something
else that tells us the state of the vdi bitmap) ourselves, and if it
is not as expected, we then extract the new node from join_message
and get a vdi bitmap from that node?

     I don't see a helper that gets the vdi bitmap from a single node,
and from get_vdi_bitmap_from_all() I gather that we need to iterate
over all the nodes to update the vdi_inuse bitmap, no?
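
     To make the question concrete, here is a minimal sketch of the
per-node merge I have in mind. It is not the sheep code: the names
(sketch_node, sketch_merge_bitmap, sketch_get_bitmap_from_all) and the
bitmap size are hypothetical, and it assumes the per-node bitmaps are
combined with a bitwise OR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size for illustration; not the sheep definition. */
#define SKETCH_BITMAP_WORDS 4

/* Stand-in for one cluster node and the vdi bitmap it holds. */
struct sketch_node {
	unsigned long vdi_inuse[SKETCH_BITMAP_WORDS];
};

/* OR a single node's bitmap into the local copy. */
static void sketch_merge_bitmap(unsigned long *dst, const unsigned long *src)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < SKETCH_BITMAP_WORDS; i++)
		dst[i] |= src[i];
}

/*
 * Visit every known node and merge its bitmap into the local one.
 * Returns the number of nodes visited; with an empty node list the
 * local bitmap is left untouched.
 */
static size_t sketch_get_bitmap_from_all(unsigned long *local,
					 const struct sketch_node *nodes,
					 size_t nr_nodes)
{
	size_t i;

	for (i = 0; i < nr_nodes; i++)
		sketch_merge_bitmap(local, nodes[i].vdi_inuse);
	return nr_nodes;
}
```

A single-node helper of this shape would cover both cases: the loop
over all nodes, and fetching from only the node named in join_message.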

     After reading the code more, I think __sd_confchg_done should get
the vdi bitmap for every node, *not* only for the first cpg node in
the group as it does now. Because of this, all the nodes except the
first cpg node have to call update_cluster_info() in a later deliver
message phase (DM_FIN). So my question is: why do we call
update_cluster_info() in different places (__sd_confchg_done for the
first cpg node, DM_FIN for the others)?

     I came up with the idea of making the following minimal change
(just move update_cluster_info() upwards):

         if (m->state == DM_FIN) {
                 switch (m->op) {
                 case SD_MSG_JOIN:
                         update_cluster_info((struct join_message *)m);
                         if (((struct join_message *)m)->cluster_status == SD_STATUS_OK)
                                 get_vdi_bitmap_from_all();
                         break;

It fixes the problem in my environment. Is it okay with you? If so, I will send it as V2.
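
For illustration only, the ordering argument can be reduced to the toy
model below. Everything in it is hypothetical (the sketch_* names, one
bitmap word per node, the node list carried in the join message), and
it ignores the threading concern raised above. It only shows that
fetching the bitmap before the node list is updated finds no node to
ask.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 8

/* Toy stand-in for the local system state. */
struct sketch_sys {
	size_t nr_nodes;                             /* stands in for sd_node_list */
	unsigned long node_bitmap[SKETCH_MAX_NODES]; /* bitmap word held by each node */
	unsigned long vdi_inuse;                     /* local merged bitmap */
};

/* Toy stand-in for a join message that carries the node list. */
struct sketch_join_msg {
	size_t nr_nodes;
	unsigned long node_bitmap[SKETCH_MAX_NODES];
};

/* Copy the node list from the join message into the local state. */
static void sketch_update_cluster_info(struct sketch_sys *s,
				       const struct sketch_join_msg *m)
{
	size_t i;

	assert(m->nr_nodes <= SKETCH_MAX_NODES);
	s->nr_nodes = m->nr_nodes;
	for (i = 0; i < m->nr_nodes; i++)
		s->node_bitmap[i] = m->node_bitmap[i];
}

/* Merge the bitmap of every node currently in the local node list. */
static void sketch_get_vdi_bitmap_from_all(struct sketch_sys *s)
{
	size_t i;

	for (i = 0; i < s->nr_nodes; i++)
		s->vdi_inuse |= s->node_bitmap[i];
}

/* Join handling in the proposed order: node list first, bitmap second. */
static void sketch_join_fin(struct sketch_sys *s,
			    const struct sketch_join_msg *m)
{
	sketch_update_cluster_info(s, m);
	sketch_get_vdi_bitmap_from_all(s);
}
```

In this model, calling sketch_get_vdi_bitmap_from_all() on a freshly
zeroed sketch_sys leaves vdi_inuse empty, while sketch_join_fin()
fills it from the nodes listed in the message.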

Thanks,
Yuan