[sheepdog] recovery and consistency questions

Corin Langosch info at corinlangosch.com
Sun Feb 8 21:15:19 CET 2015


Hi guys,

I'm currently digging around in the sheepdog sources and have a few questions regarding recovery and object consistency.
Please correct me if anything I write here is wrong - it's all just pieced together from various documents and source
files.

Sheepdog keeps track of which nodes are alive at a given point in time in an epoch object. Every time a node joins or
leaves the cluster, a new epoch is generated. A history of all epochs is kept. Objects are mapped to nodes using
consistent hashing; an object's ec-chunks are simply placed, in order, on the neighboring nodes. Using the epoch history
we can reproduce the object-to-node mapping for any past cluster state.
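To make sure I'm describing the same thing, here is a minimal Python sketch of that placement scheme as I understand it.
It is not sheepdog's actual code: the SHA-1 based hash, the absence of virtual nodes, and the names place() and
place_at() are all my own assumptions for illustration. The membership lists are the ones from the history in this mail.

```python
import hashlib
from bisect import bisect_right


def _hash(key: str) -> int:
    # Assumed hash function; any stable hash would do for this sketch.
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


def place(obj_id: str, nodes: list[str], nr_chunks: int) -> list[str]:
    """Return the nodes holding the chunks of obj_id, in chunk order.

    The first chunk goes to the first node clockwise from the object's
    hash on the ring; the following chunks go to that node's neighbors.
    """
    if not nodes:
        return []
    ring = sorted(nodes, key=_hash)
    points = [_hash(n) for n in ring]
    start = bisect_right(points, _hash(obj_id)) % len(ring)
    count = min(nr_chunks, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]


# Epoch history: epoch number -> nodes alive in that epoch.
epochs: dict[int, list[str]] = {
    1: [],
    2: ["node1"],
    3: ["node1", "node2"],
    4: ["node1", "node2", "node3"],
    5: ["node1", "node2", "node3", "node4"],
    6: ["node1", "node2", "node3"],
    7: [],
    8: ["node4"],
    9: ["node3", "node4"],
}


def place_at(obj_id: str, epoch: int, nr_chunks: int = 3) -> list[str]:
    # The placement for any past epoch depends only on that epoch's node list.
    return place(obj_id, epochs[epoch], nr_chunks)
```

Because the result depends only on the node list, place_at("A", 4) and place_at("A", 6) return the same placement, which
is what lets the epoch history reproduce old mappings.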

As for recovery, please consider the following cluster history and an object A (2:1 ec):

E Nodes                         Placement of chunks
1 []
- node1 joins
2 [node1]                       not enough nodes
- node2 joins
3 [node1, node2]                A1=node2,A3=node1
- node3 joins
4 [node1, node2, node3]         A1=node2,A2=node3,A3=node1
- node4 joins, A3 is moved to its new place
5 [node1, node2, node3, node4]  A1=node2,A2=node3,A3=node4
- node4 crashes, A3 is recovered from A1+A2
6 [node1, node2, node3]         A1=node2,A2=node3,A3=node1
- whole cluster crashes
7 []
- node4 joins
8 [node4]                       A3=node4 (no access, not enough nodes)
- node3 joins
9 [node3, node4]                A3=node4,A2=node3 (access, but A3 is outdated!!!)

How do you prevent the outdated version of A3 on node4 from being used? The latest version of A3 is on node1 (epoch 6),
but how do we know this by only keeping track of the epochs? Afaik there's no central repository which holds all
object/chunk versions?
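To make the scenario concrete, here is a small Python sketch that replays the history above. It assumes every stored
chunk copy is tagged with the epoch in which it was last written, moved or recovered; that tagging, and the helper names,
are purely my assumption for illustration. Note that the sketch cheats by having a global view of all copies, which is
exactly the thing I don't see in sheepdog.

```python
# (epoch, chunk, node): a chunk copy written, moved or recovered onto a node,
# transcribed from the cluster history above.
writes = [
    (3, "A1", "node2"),
    (3, "A3", "node1"),
    (4, "A2", "node3"),
    (5, "A3", "node4"),  # A3 moved to node4 when it joined
    (6, "A3", "node1"),  # A3 recovered from A1+A2 after node4 crashed
]


def copies_on_disk(writes):
    """Map (chunk, node) -> epoch of the most recent copy stored there."""
    disk = {}
    for epoch, chunk, node in writes:
        disk[(chunk, node)] = epoch
    return disk


def newest_copy(disk, chunk):
    """Return (node, epoch) of the most recent copy of chunk anywhere."""
    candidates = {node: e for (c, node), e in disk.items() if c == chunk}
    node = max(candidates, key=candidates.get)
    return node, candidates[node]


def stale_chunks(disk, placement):
    """List chunks whose locally placed copy is older than the newest copy."""
    stale = []
    for chunk, node in placement.items():
        _, newest_epoch = newest_copy(disk, chunk)
        if disk[(chunk, node)] < newest_epoch:
            stale.append(chunk)
    return stale


disk = copies_on_disk(writes)
placement_epoch9 = {"A2": "node3", "A3": "node4"}  # from the last table row
print(stale_chunks(disk, placement_epoch9))
```

With the global view, A3 on node4 (epoch 5) is flagged because node1 holds a copy from epoch 6. Without it, node3 and
node4 alone have nothing to compare against.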

Thank you in advance :)

Corin


