[Sheepdog] [PATCH] sheep: fix race on sys->pending_list

Tue May 8 14:08:20 CEST 2012

On Tue, May 8, 2012 at 7:49 PM, Christoph Hellwig <hch at infradead.org> wrote:
> On Tue, May 08, 2012 at 05:21:40PM +0800, Yunkai Zhang wrote:
>> From: Yunkai Zhang <qiushu.zyk at taobao.com>
>>
>> Actually, there are two race problems when we call do_cluster_request()
>> in IO threads:
>> 1) race on sys->pending_list which would also be updated in sd_notify_handler().
>> 2) calling sys->notify() in IO threads other than main thread is also
>>    mistake.
>>
>> So I move do_cluster_request() into cluster_op_done().
>
> What the reason to bother with a workqueue if you could call it
> directly?

Maybe we can call it directly, but I worry about it will be executed
in front of the works in request_queue(if there are any works in it),
as a result, the EVENT ORDER of IO request might be broken.

>
>> @@ -174,6 +174,7 @@ static void local_op_done(struct work *work)
>>  static void cluster_op_done(struct work *work)
>>  {
>>       /* request is forwarded to cpg group */
>> +     do_cluster_request(work);
>>  }
>
> Also is there any good reason to have the wrapper here?  The comment
> doesn't seem very helpful anymore either.
>
I just want to use this function's name so that it can keep consistent
with work.done, maybe I can assign do_cluster_request to work.done
directly?

-- 
Yunkai Zhang
Work at Taobao