[Sheepdog] [PATCH 1/2] sheep: optimize sheep recovery logic
Liu Yuan
namei.unix at gmail.com
Fri Nov 25 06:59:22 CET 2011
On 11/25/2011 01:55 PM, Liu Yuan wrote:
> On 11/25/2011 01:04 AM, MORITA Kazutaka wrote:
>
>> At Thu, 24 Nov 2011 20:03:17 +0800,
>> Liu Yuan wrote:
>>>
>>> From: Liu Yuan <tailai.ly at taobao.com>
>>>
>>> We don't need to iterate from epoch 1 to hdr->tgt_epoch, since once the
>>> node has recovered from a view (membership) change, the current epoch's objects
>>> hold all the object information needed for a subsequent view change.
>>>
>>> prev_rw_epoch is needed because we have to handle the situation below:
>>>
>>> init: node A, B, C.
>>>
>>> then D, E joined the cluster.
>>>
>>>              t
>>> epoch  1  2  3
>>>        A  A  A
>>>        B  B  B
>>>        C  C  C
>>>           D  D
>>>              E
>>>
>>> At time t:
>>> Since nodes now recover in parallel, A might have recovered fully
>>> while B and C have not.
>>>
>>>      pre_rw_epoch  recovered_epoch
>>>   A  1             3
>>>   B  1             1
>>>   C  1             1
>>>
>>> Then B and C need to iterate from pre_rw_epoch to hdr->tgt_epoch, rather than
>>> from recovered_epoch to hdr->tgt_epoch, to get the object list information they need.
>>
>> This is not correct. Note that new nodes can be added before
>> finishing recovery on all nodes.
>>
>> For example:
>>
>> 1. There is only one node A at epoch 1. Node A has one object 'obj'.
>>
>>      pre_rw_epoch  recovered_epoch  epoch
>>   A  -             -                1
>>
>> 2. Node B joins, and the node that stores 'obj' changes to node B. Node A
>> finishes recovery, but node B has not yet.
>>
>>      pre_rw_epoch  recovered_epoch  epoch
>>   A  -             2                2
>>   B  -             -                2
>>
>> 3. Node C joins, and node A soon finishes recovery at epoch 3, but
>> node B has not yet finished recovery at epoch 2.
>
>
> I doubt this happens in practice. In this case, A recovers successfully
> twice while B has not completed even a single recovery.
>
> If this does happen, I think we need some way to sync recovery
> information between nodes.
To be more precise, when node C joins as you describe, B has already
read the objects of epoch 1 from A, I think.
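
For reference, the iteration range from the patch description can be modelled
roughly as below. This is only an illustrative Python sketch, not sheepdog code:
epochs_to_scan() and the state table are hypothetical names, and treating the
range as inclusive of hdr->tgt_epoch is an assumption on my side.

```python
def epochs_to_scan(pre_rw_epoch, tgt_epoch):
    """Epochs whose object lists a node reads while recovering.

    Hypothetical helper: assumes the range runs from pre_rw_epoch
    up to and including tgt_epoch.
    """
    if pre_rw_epoch > tgt_epoch:
        raise ValueError("pre_rw_epoch must not exceed tgt_epoch")
    return list(range(pre_rw_epoch, tgt_epoch + 1))


# State at time t from the patch description:
# node -> (pre_rw_epoch, recovered_epoch)
state = {"A": (1, 3), "B": (1, 1), "C": (1, 1)}
tgt_epoch = 3

# Only nodes that have not yet reached tgt_epoch still need to scan.
plan = {
    node: epochs_to_scan(pre_rw, tgt_epoch)
    for node, (pre_rw, recovered) in state.items()
    if recovered < tgt_epoch
}
```

With the table above, A is left out of the plan because it has already
recovered to epoch 3, while B and C each walk epochs 1 through 3.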
Thanks,
Yuan