[Sheepdog] [PATCH] sheep: fix race on sys->pending_list

Tue May 8 10:16:21 CEST 2012

On Tue, May 8, 2012 at 4:03 PM, Christoph Hellwig <hch at infradead.org> wrote:
> On Tue, May 08, 2012 at 03:48:04PM +0800, Yunkai Zhang wrote:
>> From: Yunkai Zhang <qiushu.zyk at taobao.com>
>>
>> Actually, there are two race problems when we call do_cluster_request()
>> in IO threads:
>> 1) race on sys->pending_list which would also be updated in sd_notify_handler().
>> 2) calling sys->notify() from IO threads rather than the main thread is
>>    also a mistake.
>>
>> So I move do_cluster_request() into cluster_op_done().
>
> From the code simplicity POV this looks great, but why do we bother
> with using a workqueue at all if just executing in the main thread?
>
> Also I'm a bit worried about blocking operations in ->notify, at least
> corosync seems to do blocking network I/O from it.

I'm not familiar with the corosync driver, but I can confirm that the
zookeeper driver's notify() does not block. If corosync's notify() does
block, that should be treated as a bug.
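
To make the threading rule concrete, here is a minimal sketch of the pattern
the patch relies on: worker threads run only the slow part, and the shared
list is updated from a completion callback that runs in the main thread, so
no lock is needed. All names here (struct work, work_fn, work_done,
run_demo) are hypothetical stand-ins and not sheepdog's actual code; the
real counterparts would be the workqueue's work and done callbacks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical work item; not sheepdog's real structure. */
struct work {
	int id;
	int result;
	struct work *next;
};

/* Shared state: by convention only the main thread touches these. */
static struct work *pending_list;
static int pending_count;

/* Runs in a worker thread. It must not touch pending_list. */
static void *work_fn(void *arg)
{
	struct work *w = arg;

	w->result = w->id * 2;	/* stand-in for the slow part */
	return NULL;
}

/* Runs in the main thread once the worker has finished. */
static void work_done(struct work *w)
{
	assert(w->result == w->id * 2);
	w->next = pending_list;	/* safe: single-threaded access */
	pending_list = w;
	pending_count++;
}

/* Starts n workers, then completes each one from the calling thread. */
int run_demo(int n)
{
	pthread_t *tids = malloc(n * sizeof(*tids));
	struct work *works = calloc(n, sizeof(*works));
	int started = 0, count, i;

	if (!tids || !works) {
		free(tids);
		free(works);
		return -1;
	}

	pending_list = NULL;
	pending_count = 0;

	for (i = 0; i < n; i++) {
		works[i].id = i;
		if (pthread_create(&tids[i], NULL, work_fn, &works[i]) != 0)
			break;
		started++;
	}

	/* pthread_join() orders the worker's writes before work_done(). */
	for (i = 0; i < started; i++) {
		pthread_join(tids[i], NULL);
		work_done(&works[i]);
	}

	count = pending_count;
	pending_list = NULL;	/* items are about to be freed */
	free(works);
	free(tids);
	return count;
}
```

Calling run_demo(4) from the main thread should return 4. The sketch joins
each thread directly for brevity; an event-driven daemon would instead be
woken by the workqueue and call the done callback from its event loop, but
the rule is the same: only the main thread touches the list.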


-- 
Yunkai Zhang
Work at Taobao