[sheepdog] Fixing method for data wipe bug in the recovery
3100100878 at zju.edu.cn
Sat Nov 29 10:44:31 CET 2014
We recently encountered the bug that was reported here:
I'm not sure if this is the root cause, but by changing the initial value of ret in the function recover_object_from_replica (in recovery.c) from
ret = SD_RES_SUCCESS
to
ret = SD_RES_NO_OBJ (actually anything but SD_RES_SUCCESS), I'm able to avoid the loss of data.
The reason is that, when a node tries to recover itself to match the newest epoch, nr_copies in recover_object_from_replica is zero, so the function returns ret == SD_RES_SUCCESS even though it actually skipped every recovery attempt in the loop. I don't know why nr_copies is zero, though.
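To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not the actual code from recovery.c: the names recover_from_replicas_sketch, recover_fn, always_fail, always_ok and the SKETCH_RES_* constants are all hypothetical stand-ins, and the loop is only my assumption of the general shape. It shows that when nr_copies is zero the loop body never runs, so the caller sees whatever ret was initialized to.

```c
/* Hypothetical placeholders for sheepdog's result codes; the real
 * definitions live in the sheepdog headers and may differ. */
enum { SKETCH_RES_SUCCESS = 0, SKETCH_RES_NO_OBJ = 1 };

/* Hypothetical callback that tries to recover from one replica. */
typedef int (*recover_fn)(int replica_idx);

/* Assumed shape of the loop: try each replica until one succeeds.
 * If nr_copies == 0 the loop is skipped and initial_ret is returned
 * unchanged, which is the case discussed in this mail. */
static int recover_from_replicas_sketch(int nr_copies, int initial_ret,
                                        recover_fn try_recover)
{
    int ret = initial_ret;

    for (int i = 0; i < nr_copies; i++) {
        ret = try_recover(i);
        if (ret == SKETCH_RES_SUCCESS)
            break;
    }
    return ret;
}

/* Two trivial callbacks for exercising the sketch. */
static int always_fail(int replica_idx)
{
    (void)replica_idx;
    return SKETCH_RES_NO_OBJ;
}

static int always_ok(int replica_idx)
{
    (void)replica_idx;
    return SKETCH_RES_SUCCESS;
}
```

With nr_copies == 0 and an initial value of SKETCH_RES_SUCCESS, the sketch reports success without recovering anything. With an initial value of SKETCH_RES_NO_OBJ, the same call reports a failure, so the caller has a chance to notice that nothing was recovered.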