[sheepdog] [PATCH 9/9] sheep: show error message when object may be lost
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Tue May 7 09:43:09 CEST 2013
At Tue, 07 May 2013 15:13:12 +0800,
Liu Yuan wrote:
>
> + case SD_RES_NO_OBJ:
> + /*
> + * No object means that there was no write success at
> + * this epoch.
> + */
> + data_lost = false;
> + /* fall through */
>
> So if A, B, C all return SD_RES_NO_OBJ, you set data_lost = false, in
> this case, we don't print an error, no?
I set data_lost to false even when only one of the nodes returns
SD_RES_NO_OBJ.
A write request succeeds only when all the replicas are updated.
This means that if any node returns SD_RES_NO_OBJ, we can
guarantee that no write request succeeded at that epoch, and we
can safely use the older replicas.
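As a rough illustration of this rule (not the code in the patch), the
decision for a single epoch could be sketched as below. The names
enum reply, enum verdict and judge_epoch() are hypothetical and do not
exist in sheepdog; the sketch only assumes that each replica holder
either returns the object, returns "no object", or cannot be reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reply kinds; they only mirror the cases discussed here. */
enum reply { REPLY_OK, REPLY_NO_OBJ, REPLY_UNREACHABLE };

/* Hypothetical outcome of asking all replica holders of one epoch. */
enum verdict { OBJ_FOUND, TRY_OLDER_EPOCH, OBJ_MAY_BE_LOST };

enum verdict judge_epoch(const enum reply *replies, size_t n)
{
	bool data_lost = true;	/* pessimistic until a node says "no object" */
	size_t i;

	assert(replies != NULL && n > 0);
	for (i = 0; i < n; i++) {
		if (replies[i] == REPLY_OK)
			return OBJ_FOUND;
		/*
		 * A single "no object" reply is enough: a write succeeds
		 * only if every replica was updated, so no write can have
		 * succeeded at this epoch.
		 */
		if (replies[i] == REPLY_NO_OBJ)
			data_lost = false;
	}
	return data_lost ? OBJ_MAY_BE_LOST : TRY_OLDER_EPOCH;
}
```

With this sketch, three "no object" replies and one "no object" reply
next to two unreachable nodes both yield TRY_OLDER_EPOCH; only an epoch
where every node is unreachable yields OBJ_MAY_BE_LOST.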
For example,
  Epoch  Nodes
  1      [A, B, C, D]     <- A, B, and C have the object X.
  2      [A, B, C, D, E]  <- B, C, and E are in charge of X, but E hasn't
                             recovered X yet.
  3      [A, C, D, E]
  4      [A, D, E]        <- B and C have gone away at epoch 2
In this case,
- A tries to recover X from C, D, and E at epoch 3 first, but no
object is recovered at epoch 3. C, D, and E return SD_RES_NO_OBJ
and we can safely try the older epoch.
- A tries to recover X from B, C, and E at epoch 2. A cannot connect
to B and C, and E returns SD_RES_NO_OBJ. In this case, there is no
need to consider that X was updated at epoch 2, because if it had
been updated from X to X', E would have X'.
- Now A can safely read X from A, B, or C at epoch 1.
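The walk through the epochs in this example can be sketched as a small
loop. This is only an illustration under stated assumptions, not
sheepdog's recovery code: enum reply, NR_COPIES, example[] and
recover_from_epochs() are hypothetical names, the replies are encoded
exactly as described in the steps above, and the sketch assumes that
recovery keeps going to older epochs and merely records whether the
object may have been lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_COPIES 3	/* replicas per object in this sketch */

/* Hypothetical reply kinds, not sheepdog's real result codes. */
enum reply { REPLY_OK, REPLY_NO_OBJ, REPLY_UNREACHABLE };

/* Replies seen by A in the example; row i holds epoch i + 1. */
const enum reply example[3][NR_COPIES] = {
	{ REPLY_OK, REPLY_UNREACHABLE, REPLY_UNREACHABLE },	/* 1: A, B, C */
	{ REPLY_UNREACHABLE, REPLY_UNREACHABLE, REPLY_NO_OBJ },	/* 2: B, C, E */
	{ REPLY_NO_OBJ, REPLY_NO_OBJ, REPLY_NO_OBJ },		/* 3: C, D, E */
};

/*
 * Walk from newest_epoch down to epoch 1.  Returns the epoch at which
 * a replica was read, or 0 if none was found.  *may_be_lost is set
 * when some skipped epoch had no node answering REPLY_NO_OBJ, i.e. a
 * newer version might have been written there.
 */
unsigned int recover_from_epochs(const enum reply replies[][NR_COPIES],
				 unsigned int newest_epoch, bool *may_be_lost)
{
	unsigned int epoch;

	assert(replies != NULL && may_be_lost != NULL);
	*may_be_lost = false;
	for (epoch = newest_epoch; epoch >= 1; epoch--) {
		bool saw_no_obj = false;
		size_t i;

		for (i = 0; i < NR_COPIES; i++) {
			if (replies[epoch - 1][i] == REPLY_OK)
				return epoch;
			if (replies[epoch - 1][i] == REPLY_NO_OBJ)
				saw_no_obj = true;
		}
		if (!saw_no_obj)
			*may_be_lost = true;
	}
	return 0;
}
```

Calling recover_from_epochs(example, 3, &lost) skips epochs 3 and 2,
because each has at least one "no object" reply, and reads X at epoch 1
with lost left false.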
Thanks,
Kazutaka