On 09/13/2011 08:05 AM, MORITA Kazutaka wrote: > At Sun, 11 Sep 2011 00:49:20 +0800, > Liu Yuan wrote: >> From: Liu Yuan<tailai.ly at taobao.com> >> >> After the new node joins, command *collie-vdi-list* >> cannot list any vdi information, that are created >> before this node is joined. This is because in the >> join phase, get_vdi_bitmap_from_all() cannot get corret >> vdi bitmap *if* there is no node in sd_node_list. >> This patch moves this function after the function >> which updates the sd_node_list. >> >> With this patch, collie-vdi-list works as expected in >> newly added nodes. >> >> Signed-off-by: Liu Yuan<tailai.ly at taobao.com> >> --- >> sheep/group.c | 13 ++----------- >> 1 files changed, 2 insertions(+), 11 deletions(-) >> >> diff --git a/sheep/group.c b/sheep/group.c >> index eb0c4e2..22c4f66 100644 >> --- a/sheep/group.c >> +++ b/sheep/group.c >> @@ -863,17 +863,6 @@ static void __sd_deliver(struct cpg_event *cevent) >> break; >> } >> } >> - >> - if (m->state == DM_FIN) { >> - switch (m->op) { >> - case SD_MSG_JOIN: >> - if (((struct join_message *)m)->cluster_status == SD_STATUS_OK) >> - if (sys->status != SD_STATUS_OK) >> - get_vdi_bitmap_from_all(); >> - break; >> - } >> - } >> - >> } >> >> static void send_join_response(struct work_deliver *w) >> @@ -902,6 +891,8 @@ static void __sd_deliver_done(struct cpg_event *cevent) >> switch (m->op) { >> case SD_MSG_JOIN: >> update_cluster_info((struct join_message *)m); >> + if (((struct join_message *)m)->cluster_status == SD_STATUS_OK) >> + get_vdi_bitmap_from_all(); >> break; >> case SD_MSG_LEAVE: >> node = find_node(&sys->sd_node_list, m->nodeid, m->pid); Hi Thanks for your viewing and explanation > Thanks for your contribution, but this doesn't work. > > We cannot sleep in __sd_deliver_done (the main thread), so we cannot > call get_vdi_bitmap_from_all() in it; if all the sheep daemons call > get_vdi_bitmap_from_all() at the same time, the cluster will stuck. > > I think the right way is doing the followings in __sd_deliver (the > worker thread): > > - call get_vdi_bitmap_from_all() > - extract a new node from join_message, and get a vdi bitmap from the node > I don't quit get it. did you mean first call get_vdi_bitmap_from_all() and check then sys->vdi_inuse ourselves or something that can help us understand vdi_bitmap. if not as expected, we then extract a new node from join_message, and get a vdi bitmap from the node? I don't see helper that get vdi bitmap from one node, and from get_vdi_bitmap_from_all(), I gather that we need to iterate all the nodes to update the vdi_inuse bitmap, no? after reading the code more, I think __sd_confchg_done should get vdi bitmap done for every node, *not* only for first cpg node in the group as it is now. Because of it, all the nodes except the first-cpg-node will have to call update_cluster_info() in a later deliver messge phase (DM_FIN). So my question is, why we call udpate_cluster_info() in different places (__sd_confchg_done for first-cpg-node, DM_FIN for other). I came up the idea to do the following minimal changes (just move update_cluster_info() upwards) if (m->state == DM_FIN) { switch (m->op) { case SD_MSG_JOIN: update_cluster_info((struct join_message *)m); if (((struct join_message *)m)->cluster_status == SD_STATUS_OK) get_vdi_bitmap_from_all(); break; It fixes the problem on my environment. Is it okay with you? If so, I will send it as V2. Thanks, Yuan |