You can reproduce this hang-up with the following steps:

1) Run some intensive CPG_EVENT_DELIVER event operations, such as vdi lookup/add/del.
2) Then stop corosync on some of the nodes, one after another.

On Thu, Sep 15, 2011 at 10:30 AM, MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> wrote:

At Wed, 14 Sep 2011 11:24:14 +0800,
zituan at taobao.com wrote:
>
> From: Yibin Shen <zituan at taobao.com>
>
> This patch prevents a CPG_EVENT_CONCHG event from blocking VM I/Os.
>
> In more detail: if a CPG_EVENT_CONCHG event occurs inside a
> CPG_EVENT_DELIVER and CPG_EVENT_REQUEST event pair (for example,
> a vdi lookup operation followed by a meta object read operation),
> then the whole cluster hangs forever because the meta object
> operation is blocked. This patch delays CPG_EVENT_CONCHG event handling.
>
> Signed-off-by: Yibin Shen <zituan at taobao.com>
> ---
>  sheep/group.c |    4 +---
>  1 files changed, 1 insertions(+), 3 deletions(-)
>
> diff --git a/sheep/group.c b/sheep/group.c
> index eb0c4e2..b9dd9d7 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -1487,10 +1487,8 @@ do_retry:
>  	list_for_each_entry_safe(cevent, n, &sys->cpg_event_siblings, cpg_event_list) {
>  		struct request *req = container_of(cevent, struct request, cev);
>
> -		if (cevent->ctype == CPG_EVENT_DELIVER)
> +		if (cevent->ctype == CPG_EVENT_DELIVER || cevent->ctype == CPG_EVENT_CONCHG)
>  			continue;
> -		if (cevent->ctype == CPG_EVENT_CONCHG)
> -			break;

The intention of this code is to flush all outstanding I/Os before
processing CPG_EVENT_CONCHG. CPG_EVENT_CONCHG causes an epoch update,
and we want to avoid that while I/O requests are being processed, to
ensure strong data consistency. The pending CPG_EVENT_CONCHG will be
resumed after all outstanding I/Os have finished, so I think this code
isn't a problem. If the event isn't resumed properly, there should be
a bug in another area.

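To make the difference between the two loop variants concrete, here is a minimal standalone sketch. It is not the sheepdog code: the EV_* enum and count_started_requests() are hypothetical names invented for illustration, and the queue is modelled as a plain array instead of the cpg_event_siblings list. It only shows how far one scan gets when a CONCHG event sits between two I/O requests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types mirroring the names used in this thread. */
enum ev_type { EV_REQUEST, EV_DELIVER, EV_CONCHG };

/*
 * Count how many queued I/O requests one scan of the queue would start.
 * stop_at_conchg = true  models the original loop (break on CONCHG).
 * stop_at_conchg = false models the patched loop (skip over CONCHG).
 */
size_t count_started_requests(const enum ev_type *queue, size_t len,
                              bool stop_at_conchg)
{
    size_t started = 0;

    for (size_t i = 0; i < len; i++) {
        if (queue[i] == EV_DELIVER)
            continue;       /* not an I/O request, skip it */
        if (queue[i] == EV_CONCHG) {
            if (stop_at_conchg)
                break;      /* barrier: later requests must wait */
            continue;       /* patched: later requests overtake it */
        }
        started++;          /* EV_REQUEST: would be handed to a worker */
    }
    return started;
}
```

For a queue of REQUEST, DELIVER, CONCHG, REQUEST, the original variant starts one request and leaves the second one behind the membership change, while the patched variant starts both. That is the trade-off under discussion: the barrier protects the epoch update, but anything queued behind it waits until the CONCHG event is resumed.
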
Are there steps to reproduce the hang-up?

Anyway, start_cpg_event_work() should be refactored to be more
readable, I think.

Thanks,

Kazutaka
--
sheepdog mailing list
sheepdog at lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog