[sheepdog] read/write during recovery

Liu Yuan namei.unix at gmail.com
Tue Jul 24 09:03:52 CEST 2012


On 07/24/2012 02:50 PM, Dietmar Maurer wrote:
>> On 07/24/2012 02:13 PM, Dietmar Maurer wrote:
>>>> Why can we simply reject read/writes until we start recovering a
>>>> specific object?
>>>
>>> Sorry, the question is:
>>>
>>> Why can't we simply reject read/writes until we start recovering a specific
>>> object?
>>>
>>
>> What do you mean by 'reject'? We can't simply return EIO to the guest; that
>> is why we have wait queues, which re-queue the requests once certain
>> conditions are met.
> 
> That was the question - Why can't we reject? We already do:
> 
> 		if (is_recovery_init()) {
> 			req->rp.result = SD_RES_OBJ_RECOVERING;
> 
> so we already rely on gateway retry?

Yes, the gateway is supposed to retry.
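
To make the retry behaviour concrete, here is a minimal sketch of a gateway
loop that keeps resending while the target answers SD_RES_OBJ_RECOVERING.
Only that constant's name comes from the snippet quoted above; its numeric
value, the other result codes, gateway_forward(), send_fn and fake_peer()
are hypothetical names made up for illustration, not sheepdog's actual code.

```c
/* Hypothetical result codes: only the name SD_RES_OBJ_RECOVERING appears
 * in this thread, and the numeric values here are invented. */
enum { SD_RES_SUCCESS = 0, SD_RES_OBJ_RECOVERING = 1 };

/* Hypothetical stand-in for "forward the request to the target node". */
typedef int (*send_fn)(void *ctx);

/* Resend while the target says the object is still being recovered.
 * Makes at most max_retries + 1 attempts and returns the last result. */
static int gateway_forward(send_fn send, void *ctx, int max_retries)
{
	int ret = SD_RES_OBJ_RECOVERING;

	for (int attempt = 0; attempt <= max_retries; attempt++) {
		ret = send(ctx);
		if (ret != SD_RES_OBJ_RECOVERING)
			return ret;
		/* A real gateway would wait or re-queue here, not spin. */
	}
	return ret;
}

/* Fake target: answers "recovering" for the first *remaining calls. */
static int fake_peer(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return SD_RES_OBJ_RECOVERING;
	}
	return SD_RES_SUCCESS;
}
```

The guest never sees the intermediate SD_RES_OBJ_RECOVERING results in this
sketch; it only sees the final one, which is the point of retrying at the
gateway instead of returning EIO.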

> 
> All we need to do is to log write requests and apply them later, after the
> object is recovered. IMHO, that would be much simpler than the current code.
> 
>> Basically, there are two mechanisms: 1) use wait queues to retry when the
>> targeted object is being migrated/recovered 2) schedule objects that are
>> being requested with higher priority than those that aren't.
> 
> My suggestion is to use a write journal for writes during recovery. So writes
> simply succeed and there is no need for the queue/schedule code.
> 

Where do you store the write journal? If writes simply succeed with the
journal stored on only one node, what do you do if that node goes down
permanently? We would lose the data for sure, and that case is non-recoverable.

>> Note, with the consistent hashing algorithm, only a very small set of
>> objects has to be migrated/recovered; most objects don't need to be
>> recovered, they just stay where they are. This means most of the requests
>> during a configuration event will be serviced as normal.
> 
> I thought that depends on the number of nodes. For example,
> if I have 3 nodes and copies=2, about 1/3 of all objects need to be recovered?
> 

Yes, this would be a problem for small clusters.
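
The dependence on cluster size can be checked with a toy model. The sketch
below assumes n evenly spaced nodes on a ring with no virtual nodes, and it
counts an object as affected when its replica list changes after one node
leaves. Both are simplifying assumptions, and pick_replicas() and
moved_fraction() are hypothetical helpers, not sheepdog functions, so the
numbers only illustrate the trend.

```c
#include <stdbool.h>

#define MAX_NODES 64	/* callers must keep n <= MAX_NODES */

/* Toy ring: node i sits at position i * step. An object at position p is
 * stored on the first alive node at or after p (wrapping around) and on
 * the following alive nodes, up to `copies` in total. */
static int pick_replicas(int p, int step, int n, const bool *alive,
			 int copies, int *out)
{
	int first = ((p + step - 1) / step) % n;
	int found = 0;

	for (int k = 0; k < n && found < copies; k++) {
		int idx = (first + k) % n;

		if (alive[idx])
			out[found++] = idx;
	}
	return found;
}

/* Fraction of ring positions whose replica list changes once node
 * `gone` has left an n-node cluster. */
static double moved_fraction(int n, int copies, int gone)
{
	const int step = 1000;
	bool before[MAX_NODES], after[MAX_NODES];
	int changed = 0;

	for (int i = 0; i < n; i++) {
		before[i] = true;
		after[i] = (i != gone);
	}
	for (int p = 0; p < n * step; p++) {
		int a[MAX_NODES], b[MAX_NODES];
		int na = pick_replicas(p, step, n, before, copies, a);
		int nb = pick_replicas(p, step, n, after, copies, b);
		bool same = (na == nb);

		for (int k = 0; same && k < na; k++)
			same = (a[k] == b[k]);
		if (!same)
			changed++;
	}
	return (double)changed / (n * step);
}
```

Comparing moved_fraction(3, 2, 2) with moved_fraction(10, 2, 5) shows the
affected share shrinking as the cluster grows. The exact share in a real
cluster depends on the actual placement scheme and on what is counted as
needing recovery.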

Thanks,
Yuan



