[sheepdog] [PATCH] sheep: handle recovery request in check_request_in_recovery()

Liu Yuan namei.unix at gmail.com
Sat Jun 2 17:09:53 CEST 2012


On 06/02/2012 10:57 PM, Christoph Hellwig wrote:

> I need to add a printout of the error value in the
> recover_object_from_replica failure case, but I suspect we keep getting
> SD_RES_OBJ_RECOVERING back from the target sheep, and then move back
> to older versions.  I guess do_recover_object simply needs to handle
> SD_RES_OBJ_RECOVERING specially, e.g. by trying other recovery first but
> going back to it until we get a different return value.


Then why does the old master pass this test? It seems that the old master
passes only because it wastes time in this case trying to read a
non-existent object on one node, which gives the other nodes enough time
to recover the objects in the meantime. It is a kind of timing problem.

Well, I don't think we need special handling for this case. If other
sheep can recover the targeted object, why can't this unfortunate one?
Our recovery algorithm should be guaranteed to find the object if it
exists in either the working directory or the snap cache of Farm.
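
To make the expectation concrete, here is a rough sketch of the lookup I
have in mind. It is only an illustration, not the actual sheep code: the
names (obj_store, store_has, find_object), the result codes and the flat
array standing in for a store are all hypothetical, and checking the
working directory before the snap cache is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch; not sheepdog's real values. */
enum lookup_result { FOUND_IN_WORKDIR, FOUND_IN_SNAP_CACHE, NOT_FOUND };

/* Hypothetical stand-in for a store: a flat array of object ids. */
struct obj_store {
	const uint64_t *oids;
	size_t nr;
};

/* Linear scan: return true if the store holds the given object id. */
bool store_has(const struct obj_store *s, uint64_t oid)
{
	for (size_t i = 0; i < s->nr; i++)
		if (s->oids[i] == oid)
			return true;
	return false;
}

/*
 * Look in the working directory first, then fall back to the snap cache
 * (assumed order). NOT_FOUND is returned only if neither place has it.
 */
enum lookup_result find_object(const struct obj_store *workdir,
			       const struct obj_store *snap_cache,
			       uint64_t oid)
{
	if (store_has(workdir, oid))
		return FOUND_IN_WORKDIR;
	if (store_has(snap_cache, oid))
		return FOUND_IN_SNAP_CACHE;
	return NOT_FOUND;
}
```

The point is that an object present in either place is always found, so
a sheep should not need a special case as long as both places are
consulted.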

Thanks,
Yuan


