On 04/27/2012 09:15 PM, Yunkai Zhang wrote:
> From: Yunkai Zhang <qiushu.zyk at taobao.com>
>
> A deadlock was found in the following scenario:
>
> Suppose that there are two sheeps, S1 and S2, and their event_queues
> are empty.
>
> Now S1 receives a notify message, M1, and calls sd_notify_handler(),
> which adds a notify event to its event_queue and then calls
> process_request_event_queues() to queue_work this event.
>
> At the same time, S2 sends a notify message, M2, to the cluster, and an
> I/O request (e.g. a do_lookup_vdi operation) is submitted to S1 when
> S2 calls zk_dispatch() to handle M2.
>
> After S1 receives the I/O request from S2, it eventually calls
> process_request_event_queues() to deal with it. If S1 calls this
> function before M1's event_done() has finished, the I/O request is
> not processed because the event_queue is not empty. This leads to
> a deadlock between S1 and S2: S2 blocks in read(), waiting for the
> data sent by S1, and the whole cluster is suspended forever.
>
> To fix this problem, we modify event_done() so that it also processes
> the request_queue once the event_queue is empty.
>
> Signed-off-by: Yunkai Zhang <qiushu.zyk at taobao.com>
> ---
>  sheep/group.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/sheep/group.c b/sheep/group.c
> index b4cf2da..7e19d33 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -964,8 +964,7 @@ static void event_done(struct work *work)
>  	if (ret)
>  		panic("failed to register event fd");
>
> -	if (!list_empty(&sys->event_queue))
> -		process_request_event_queues();
> +	process_request_event_queues();
>  }
>
>  int is_access_to_busy_objects(uint64_t oid)

Applied.

Thanks,
Yuan