[sheepdog] Is it necessary for outstanding io block leave/join event?
Liu Yuan
namei.unix at gmail.com
Fri May 18 04:04:33 CEST 2012
On 05/17/2012 07:41 PM, Liu Yuan wrote:
> On 05/17/2012 04:01 PM, Liu Yuan wrote:
>
>> On 05/17/2012 03:29 PM, MORITA Kazutaka wrote:
>>
>>>>>
>>>>> This assumption does not seem necessary, at least for Farm, where I/O is
>>>>> always routed to objects in the working directory.
>>> Really? I thought that this problem does not depend on the underlying
>>> storage driver.
>>>
>>> If there is one node, A, and the number of copies is 1, how does
>>> Farm handle the following case?
>>>
>>> - the user adds a second node, B, while there are in-flight I/Os on
>>> node A
>>> - node A increments the epoch from 1 to 2, and node B recovers
>>> objects from epoch 1 on node A
>>> - after node B receives the objects for epoch 2, the in-flight I/Os on
>>> node A update objects in epoch 1 on node A.
>>
>
>
> On second thought, it seems that this case cannot happen at all. When
> node B tries to recover the object from A, it will find that the target
> object is busy, and the recovery request will be placed on
> sys->req_wait_for_obj_list.
>
> Thanks,
> Yuan
>
>>
>> This really is a race in Farm for now. But I think we can
>> exclude it as follows:
>>
>> 1) when A receives a recovery request, check whether the requested oid
>> is on the outstanding list.
>> 2) if yes, A puts the request on a waiting list; if no, A services it.
>> 3) when the in-flight I/O finishes on A, check whether any request is on
>> the waiting list; if yes, resume it.
>>
>> I think this algorithm gives us finer-grained blocking: only the requests
>> that really need to wait are blocked. Our current algorithm blocks all
>> requests, and most of them are innocent victims.
>>
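The three steps quoted above can be sketched roughly as follows. This is
only an illustration, not sheepdog's actual code: struct node_state,
io_start(), recovery_request(), io_finish() and the fixed MAX_REQS arrays
are all hypothetical names chosen for this example, and a real
implementation would use the existing request lists instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REQS 16 /* hypothetical fixed capacity, for illustration only */

enum req_result { REQ_SERVED, REQ_PARKED, REQ_REJECTED };

/* Hypothetical per-node state; not sheepdog's real data structures. */
struct node_state {
	uint64_t outstanding[MAX_REQS]; /* oids with in-flight I/O */
	size_t nr_outstanding;
	uint64_t waiting[MAX_REQS];     /* oids of parked recovery requests */
	size_t nr_waiting;
};

bool is_outstanding(const struct node_state *n, uint64_t oid)
{
	for (size_t i = 0; i < n->nr_outstanding; i++)
		if (n->outstanding[i] == oid)
			return true;
	return false;
}

/* Record that an I/O on oid has started; returns false if the list is full. */
bool io_start(struct node_state *n, uint64_t oid)
{
	if (n->nr_outstanding == MAX_REQS)
		return false;
	n->outstanding[n->nr_outstanding++] = oid;
	return true;
}

/* Steps 1 and 2: serve the recovery request now, or park it. */
enum req_result recovery_request(struct node_state *n, uint64_t oid)
{
	if (!is_outstanding(n, oid))
		return REQ_SERVED;      /* object is idle: service immediately */
	if (n->nr_waiting == MAX_REQS)
		return REQ_REJECTED;    /* sketch only: waiting list is full */
	n->waiting[n->nr_waiting++] = oid;
	return REQ_PARKED;
}

/* Step 3: an I/O on oid finished; returns how many parked requests resume. */
size_t io_finish(struct node_state *n, uint64_t oid)
{
	size_t resumed = 0, kept = 0;

	/* Drop one outstanding entry for this oid. */
	for (size_t i = 0; i < n->nr_outstanding; i++) {
		if (n->outstanding[i] == oid) {
			n->outstanding[i] = n->outstanding[--n->nr_outstanding];
			break;
		}
	}
	if (is_outstanding(n, oid))
		return 0; /* another I/O on the same object is still in flight */

	/* Resume requests waiting on this oid; keep the rest parked. */
	for (size_t i = 0; i < n->nr_waiting; i++) {
		if (n->waiting[i] == oid)
			resumed++; /* a real server would re-queue the request here */
		else
			n->waiting[kept++] = n->waiting[i];
	}
	n->nr_waiting = kept;
	return resumed;
}
```

The point of the sketch is that a recovery request for an idle object is
served at once, and only a request for an object with in-flight I/O waits.
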
Is there agreement on removing the simple store? If so, I am going to
rework the abstracted store layer and Farm to support concurrent access to
objects.
Thanks,
Yuan