[sheepdog] read/write during recovery

Dietmar Maurer dietmar at proxmox.com
Wed Jul 25 22:20:31 CEST 2012


> For example:
> 
>  1. There are two nodes, A and B.
>  2. Node C joins Sheepdog, and journal data is written on node C until
>     it finishes recovery.
>  3. If node D joins Sheepdog before node C finishes recovery, node D
>     reads actual data from nodes A and B, and journal data from node C.
>     At the same time, node C also needs to write journal data locally
>     to handle write requests.
>  4. If node E joins Sheepdog before node C and D finish recovery, node
>     E needs to read journal data from node C and D.  Node E needs to
>     know which journal is newer to apply journal in the correct order.

The real problem is that Sheepdog changes the node mapping as soon
as a new node joins. To me, it seems safer to keep the current
mapping until all new nodes are in sync.

One can implement that by tracking the node status together with the epoch.
A node can be DOWN, UP (but not synced), or UP_SYNCED.

During writes, we consider two mappings: one using only UP_SYNCED nodes,
the other using both UP and UP_SYNCED nodes. We write to all nodes
selected by either mapping. For reads we only consider nodes in status UP.
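
The scheme above could be sketched roughly as follows. This is only an
illustration, not Sheepdog code: place() is a made-up stand-in for the
real object placement, and all function names are hypothetical. It also
assumes that reads are served from the UP_SYNCED-only mapping, which is
one way to read "keep the current mapping until all new nodes are in sync".

```python
import zlib
from enum import Enum


class NodeStatus(Enum):
    DOWN = 0
    UP = 1          # joined, but recovery not finished
    UP_SYNCED = 2   # joined and fully recovered


def place(oid, nodes, copies):
    """Hypothetical placement (not Sheepdog's): pick `copies` consecutive
    nodes from the sorted node list, starting at a position derived from
    the object id."""
    ring = sorted(nodes)
    if not ring:
        return []
    start = zlib.crc32(oid.encode()) % len(ring)
    count = min(copies, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]


def synced_mapping(status, oid, copies):
    """First mapping: only nodes that have finished recovery."""
    synced = [n for n, s in status.items() if s is NodeStatus.UP_SYNCED]
    return place(oid, synced, copies)


def joined_mapping(status, oid, copies):
    """Second mapping: every node that is up, synced or not."""
    joined = [n for n, s in status.items() if s is not NodeStatus.DOWN]
    return place(oid, joined, copies)


def write_targets(status, oid, copies):
    """Writes go to the union of both mappings, so the old replicas stay
    current while the new ones already receive fresh data."""
    return set(synced_mapping(status, oid, copies)) | set(
        joined_mapping(status, oid, copies)
    )


def read_targets(status, oid, copies):
    """Assumption: reads use the synced-only mapping, so they never hit
    a node that is still recovering."""
    return synced_mapping(status, oid, copies)


status = {
    "A": NodeStatus.UP_SYNCED,
    "B": NodeStatus.UP_SYNCED,
    "C": NodeStatus.UP,       # just joined, still recovering
    "D": NodeStatus.DOWN,
}
print(sorted(write_targets(status, "vdi-0001", copies=2)))
print(sorted(read_targets(status, "vdi-0001", copies=2)))
```

Once C finishes recovery its status becomes UP_SYNCED, the two mappings
become identical, and the extra writes stop without any journal replay.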

Would that avoid the error case above?




