[Sheepdog] [PATCH] sheep: handle CPG_EVENT_REQUEST even if CPG_EVENT_CONCHG exists
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Thu Sep 15 04:30:17 CEST 2011
At Wed, 14 Sep 2011 11:24:14 +0800,
zituan at taobao.com wrote:
>
> From: Yibin Shen <zituan at taobao.com>
>
> This patch prevents a CPG_EVENT_CONCHG event from blocking VM I/Os.
>
> In more detail: if a CPG_EVENT_CONCHG event occurs between a
> CPG_EVENT_DELIVER and CPG_EVENT_REQUEST event pair (for example,
> a vdi lookup operation followed by a meta object read operation),
> then the whole cluster hangs forever because the meta object operation
> is blocked. This patch delays handling of the CPG_EVENT_CONCHG event.
>
> Signed-off-by: Yibin Shen <zituan at taobao.com>
> ---
> sheep/group.c | 4 +---
> 1 files changed, 1 insertions(+), 3 deletions(-)
>
> diff --git a/sheep/group.c b/sheep/group.c
> index eb0c4e2..b9dd9d7 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -1487,10 +1487,8 @@ do_retry:
> list_for_each_entry_safe(cevent, n, &sys->cpg_event_siblings, cpg_event_list) {
> struct request *req = container_of(cevent, struct request, cev);
>
> - if (cevent->ctype == CPG_EVENT_DELIVER)
> + if (cevent->ctype == CPG_EVENT_DELIVER || cevent->ctype == CPG_EVENT_CONCHG)
> continue;
> - if (cevent->ctype == CPG_EVENT_CONCHG)
> - break;
The intention of this code is to flush all outstanding I/Os before
processing CPG_EVENT_CONCHG. CPG_EVENT_CONCHG causes an epoch update,
and we want to avoid that while I/O requests are being processed, to
ensure strong data consistency.

The pending CPG_EVENT_CONCHG will be resumed after all outstanding I/Os
have finished, so I don't think this code is the problem. If the event
isn't resumed properly, there should be a bug in another area. Are
there steps to reproduce the hang?
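
To make the intended ordering concrete, here is a minimal standalone
sketch. It is not the actual sheep/group.c code: the struct layout, the
helper names (dispatch_requests, try_process_conchg, complete_io) and
the EV_* constants are all hypothetical, and it models only the
"skip DELIVER, stop at CONCHG, resume once I/O has drained" behaviour
described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for CPG_EVENT_DELIVER/REQUEST/CONCHG. */
enum ev_type { EV_DELIVER, EV_REQUEST, EV_CONCHG };

struct event {
    enum ev_type type;
    int done;               /* non-zero once dispatched or processed */
};

struct state {
    int nr_outstanding_io;  /* requests dispatched but not yet completed */
    unsigned epoch;         /* bumped when a CONCHG is processed */
};

/*
 * Walk the queue in order and dispatch pending requests.
 * DELIVER entries are skipped; a pending CONCHG acts as a barrier,
 * so requests queued behind it wait for the epoch update.
 * Returns the number of requests dispatched.
 */
static int dispatch_requests(struct state *s, struct event *q, size_t n)
{
    int dispatched = 0;

    for (size_t i = 0; i < n; i++) {
        if (q[i].done || q[i].type == EV_DELIVER)
            continue;
        if (q[i].type == EV_CONCHG)
            break;
        q[i].done = 1;
        s->nr_outstanding_io++;
        dispatched++;
    }
    return dispatched;
}

/*
 * Process the first pending CONCHG, but only when no I/O is in flight.
 * Assumes dispatch_requests() has already run, so nothing dispatchable
 * is left ahead of the CONCHG. Returns 1 if the epoch was updated,
 * 0 if the event stays pending (or there is none).
 */
static int try_process_conchg(struct state *s, struct event *q, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (q[i].done || q[i].type != EV_CONCHG)
            continue;
        if (s->nr_outstanding_io > 0)
            return 0;       /* stays pending; retry after I/O completes */
        q[i].done = 1;
        s->epoch++;
        return 1;
    }
    return 0;
}

/* Called when one dispatched request finishes. */
static void complete_io(struct state *s)
{
    assert(s->nr_outstanding_io > 0);
    s->nr_outstanding_io--;
}
```

In this model, a caller would run dispatch_requests() and then
try_process_conchg() each time an event arrives or complete_io() is
called. A request queued after a CONCHG is dispatched only once the
epoch has been bumped. Turning the break into a continue, as the patch
does, would let that request run under the old epoch.
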
Anyway, start_cpg_event_work() should be refactored to be more
readable, I think.
Thanks,
Kazutaka