[sheepdog] [PATCH] make recovery not to retry when recover_object_from_replica() fail

Liu Yuan namei.unix at gmail.com
Wed May 30 09:11:27 CEST 2012


On 05/30/2012 12:04 PM, levin li wrote:

> Since sheep now waits and retries when the epoch is inconsistent,
> recover_object_from_replica() will never get a response with
> SD_RES_NEW_NODE_VER, because the peer node retries the request
> locally until the epochs become consistent.
> 
> If the epoch of the request sender is older than the receiver's, the
> sender gets SD_RES_OLD_NODE_VER. This means the sender's epoch is about
> to be incremented and a new recovery work will soon replace the current
> one. We should not waste time recovering for the out-of-date recovery
> work; instead, the current recovery work should stop and wait to be
> replaced.
> 
> As for SD_RES_NETWORK_ERROR, recover_object_from_replica() currently
> gets it only if there is an EIO when reading the object. In this case
> we should not make recovery retry, because the next attempt may get an
> EIO as well, leaving the recovery work hanging there and retrying
> constantly. Instead, it should try another copy or retry in another
> epoch.
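
The policy described in the quoted message can be sketched roughly as
follows. This is a minimal illustration, not the patch itself. The SD_RES_*
names are taken from the message, but their values here, the
recover_action enum, and the helpers next_action() and recover_object()
are all hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Result codes named after those in the message above. The enum and
 * its values are assumptions made for this sketch only. */
enum sd_result {
    SD_RES_SUCCESS,
    SD_RES_OLD_NODE_VER,
    SD_RES_NEW_NODE_VER,
    SD_RES_NETWORK_ERROR,
};

/* Hypothetical outcomes of one recovery attempt. */
enum recover_action {
    RECOVER_DONE,      /* object was recovered */
    RECOVER_NEXT_COPY, /* skip this replica and try another copy */
    RECOVER_ABORT,     /* stop; a newer recovery work will replace this one */
    RECOVER_FAILED,    /* no usable copy in this epoch */
};

/* Map a single replica response to an action. Note that no case
 * retries the same replica. */
static enum recover_action next_action(enum sd_result res)
{
    switch (res) {
    case SD_RES_SUCCESS:
        return RECOVER_DONE;
    case SD_RES_OLD_NODE_VER:
        /* Our epoch is stale: cease and wait to be replaced. */
        return RECOVER_ABORT;
    case SD_RES_NETWORK_ERROR:
        /* Likely an EIO on the peer; retrying would probably hit it again. */
        return RECOVER_NEXT_COPY;
    default:
        /* SD_RES_NEW_NODE_VER is not expected, since the peer retries locally. */
        return RECOVER_NEXT_COPY;
    }
}

/* Walk the responses of the candidate replicas in order and stop at the
 * first one that yields a final decision. */
static enum recover_action recover_object(const enum sd_result *results,
                                          size_t n)
{
    assert(results != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        enum recover_action action = next_action(results[i]);

        if (action != RECOVER_NEXT_COPY)
            return action;
    }
    return RECOVER_FAILED;
}
```

In this sketch an EIO on one replica simply moves on to the next copy,
while SD_RES_OLD_NODE_VER ends the whole attempt immediately, so the
recovery work never spins on the same failing replica.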


Applied, thanks.

Yuan


