[sheepdog] read/write during recovery

Wed Jul 25 09:42:52 CEST 2012

At Wed, 25 Jul 2012 06:04:57 +0000,
Dietmar Maurer wrote:
> 
> >> My naïve patch looks like this (can be optimized further):
> 
> >IIUC, your patch does not handle write requests because write
> >journaling is not implemented yet, yes?  I think it is not easy to
> >implement journaling across nodes.  Do you have any ideas to implement
> >it simply?
> 
> The idea is to simply discard those write requests. We can do that, because 
> there is at least one node which has data locally, and that node applies all 
> writes (we sync data from that node later).

How do you handle the following case?

 1. There are two nodes, A and B (redundancy level is 2), and each node
    has one object.
 2. Node C joins Sheepdog, and new placement of the object becomes
    node B and C.
 3. A VM writes data to the object, and node B completes the request
    but node C rejects it since recovery has not started.
 4. Node B crashes before node C gets the updated data from node B,
    and then the written data will be lost even though only one node
    fails.  In addition, the VM can read the old object after the
    failure, which breaks the block device semantics.
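
To make the sequence concrete, here is a rough toy model of the four
steps above.  It is only a sketch: the Node class, its "recovered"
flag, and simulate() are hypothetical names made up for illustration
and are not Sheepdog code.  It assumes a node discards any write it
receives before it has recovered the object, as the patch proposes.

```python
# Toy model of the scenario; all names here are hypothetical,
# not taken from the Sheepdog source.

class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.recovered = False  # True once the node holds a valid copy
        self.data = None

    def write(self, value):
        """Apply a write; return False if the write is discarded."""
        if not self.recovered:
            # Assumed behaviour of the proposed patch:
            # drop writes that arrive before recovery.
            return False
        self.data = value
        return True


def simulate():
    a, b, c = Node("A"), Node("B"), Node("C")

    # Step 1: A and B each hold a copy of the object.
    for node in (a, b):
        node.recovered = True
        node.data = "old"

    # Step 2: C joins; the object's placement becomes B and C.
    placement = [b, c]

    # Step 3: the VM writes; B applies it, C discards it.
    acks = [node.write("new") for node in placement]

    # Step 4: B crashes before C has synced from it.
    b.alive = False

    # Copies that are still readable after the single failure.
    readable = [n.data for n in (a, b, c)
                if n.alive and n.data is not None]
    return acks, readable


acks, readable = simulate()
print(acks)      # one ack from B, one discard from C
print(readable)  # only the stale copy on A is left
```

In this model the only surviving copy is the stale one on node A, so
the acknowledged write is gone after a single node failure.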

Thanks,

Kazutaka

> 
> The only problem is a race condition, because the gateway node does not 
> update all copies at the same time. But we can solve that by other means.
> 
> - Dietmar
>