[sheepdog] read/write during recovery
Dietmar Maurer
dietmar at proxmox.com
Wed Jul 25 10:01:56 CEST 2012
> > >> My naïve patch looks like this (can be optimized further):
> >
> > >IIUC, your patch does not handle write requests because write
> > >journaling is not implemented yet, yes? I think it is not easy to
> > >implement journaling across nodes. Do you have any ideas to
> > >implement it simply?
> >
> > The idea is to simply discard those write requests. We can do that,
> > because there is at least one node which has data locally, and that
> > node applies all writes (we sync data from that node later).
>
> How do you handle the following case?
>
> 1. There are two nodes A and B (redundancy level is 2), and each node
> has one object.
> 2. Node C joins Sheepdog, and new placement of the object becomes
> node B and C.
> 3. A VM writes data to the object, and node B completes the request
> but node C rejects it since recovery is not started.
> 4. Node B crashes before node C gets the updated data from node B,
> and then the written data will be lost even though only one node
> fails. In addition, the VM can read the old object after the
> failure, which breaks the block device semantics.
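The four steps quoted above can be replayed with a small toy model. This is only a sketch to make the failure visible, not Sheepdog code: the `Node` class, its `recovered` flag and the `surviving_copies` helper are hypothetical names invented for illustration.

```python
# Hypothetical toy model of the quoted scenario; not Sheepdog code.

class Node:
    def __init__(self, name, recovered=True):
        self.name = name
        self.recovered = recovered  # False: the object has not been copied here yet
        self.objects = {}           # object id -> data
        self.alive = True

    def write(self, oid, data):
        """Apply a write; a node that has not recovered the object discards it."""
        if not self.recovered:
            return False
        self.objects[oid] = data
        return True


def surviving_copies(nodes, oid):
    """Return the data held by every live node that stores the object."""
    return [n.objects[oid] for n in nodes if n.alive and oid in n.objects]


# Step 1: A and B each hold a copy of the object.
a, b = Node("A"), Node("B")
a.objects["obj"] = "old"
b.objects["obj"] = "old"

# Step 2: C joins; the new placement is B and C, but C has not recovered yet.
c = Node("C", recovered=False)

# Step 3: the VM writes; B applies the write, C discards it.
results = [n.write("obj", "new") for n in (b, c)]

# Step 4: B crashes before C syncs from it.
b.alive = False

remaining = surviving_copies([a, b, c], "obj")
print(results, remaining)
```

After step 4 the only live copy is the stale one on A, so the acknowledged write is gone although only one node failed.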
Sure, if all nodes with actual data crash, you have a problem. So sheepdog
tries to store data ASAP to make that unlikely? I guess I got it now ;-)
But using a journal for writes (during recovery) is still a good idea, because:
- no delays on writes when in recovery mode
- it uses less memory
What do you think?
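To make the journal idea concrete, here is a minimal sketch of a node that journals writes for objects it has not recovered yet and replays them once the object arrives. It is an assumption about how such a journal could behave, not the patch discussed in this thread: `RecoveringNode`, `write` and `finish_recovery` are hypothetical names, the journal is an in-memory list for brevity (a real one would presumably live on disk), and writes are assumed to stay within the object's size.

```python
# Hypothetical sketch of write journaling during recovery; not Sheepdog code.

class RecoveringNode:
    def __init__(self):
        self.objects = {}  # object id -> bytearray with the object's data
        self.journal = []  # (oid, offset, data) entries in arrival order

    def write(self, oid, offset, data):
        """Acknowledge every write; journal it if the object is not local yet."""
        if oid in self.objects:
            self._apply(oid, offset, data)
        else:
            self.journal.append((oid, offset, bytes(data)))
        return True

    def _apply(self, oid, offset, data):
        obj = self.objects[oid]
        obj[offset:offset + len(data)] = data

    def finish_recovery(self, oid, recovered_data):
        """Install the recovered object, then replay its journaled writes in order."""
        self.objects[oid] = bytearray(recovered_data)
        pending = [e for e in self.journal if e[0] == oid]
        self.journal = [e for e in self.journal if e[0] != oid]
        for _, offset, data in pending:
            self._apply(oid, offset, data)


node = RecoveringNode()
acked = node.write("obj", 2, b"XY")      # object not recovered yet: journaled
node.finish_recovery("obj", b"aaaaaa")   # object arrives from a peer
print(acked, bytes(node.objects["obj"]))
```

Replaying in arrival order is safe even when the recovered copy already contains some of the writes, because re-applying the same overwrite leaves the data unchanged.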