[sheepdog] [PATCH 1/2] sheep/recovery: get recovering local object right

Liu Yuan namei.unix at gmail.com
Tue Mar 3 03:22:53 CET 2015


On Mon, Mar 02, 2015 at 04:02:37PM +0800, Liu Yuan wrote:
> From: Liu Yuan <liuyuan at cmss.chinamobile.com>
> 
> One of the accelerations for recovery is that we try to recover the object from
> the local node as much as possible. The implementation is straightforward:
> 
> 1 first, get the hash of the object to be recovered from the stale directory, if any
> 2 then compare the fingerprint with the one on the remote node
> 3 if identical, we can safely recover it from the local stale directory.
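
A minimal sketch of the three steps above, for illustration only. The names
choose_recover_src, FP_LEN and the enum are hypothetical and not taken from the
sheep source; the 20-byte fingerprint size is an assumption (e.g. a SHA-1
digest).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FP_LEN 20 /* assumed fingerprint size, not taken from the patch */

enum recover_src { RECOVER_FROM_LOCAL, RECOVER_FROM_REMOTE };

/*
 * has_stale: whether a copy of the object exists in the local stale directory.
 * local_fp:  fingerprint of that stale copy (ignored when has_stale is false).
 * remote_fp: fingerprint reported by the remote node.
 */
static enum recover_src choose_recover_src(bool has_stale,
					   const uint8_t local_fp[FP_LEN],
					   const uint8_t remote_fp[FP_LEN])
{
	/* Recover locally only if a stale copy exists and the hashes match. */
	if (has_stale && memcmp(local_fp, remote_fp, FP_LEN) == 0)
		return RECOVER_FROM_LOCAL;
	return RECOVER_FROM_REMOTE;
}
```
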
> 
> But this logic is never executed in the following case:
> 
> 0 sheep tries to recover object A at epoch 5; we denote it as A.5
> 1 but sheep finds we have a local copy A.2 due to multiple node events
> 2 then sheep gets the fingerprint of A.2 and compares it with the remote node's.
> 3 the fingerprints are identical, so this sheep tries to recover it from A.2
> 4 if, unfortunately, A.5 is also calculated to be on this node, even though this
>   sheep doesn't have it, our code will first try to link A.5
> 5 unfortunately, A.5 is not there, and before we actually try to link A.2,
>   sheep fails out because ->link(A.5) returns an error.
> 
> The fix is easy: just try ->link(A.2) before ->link(A.5).
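
To make the ordering problem concrete, here is a toy model of the case
described above. It is not the actual patch: struct stale_dir, link_obj and the
two recover_* helpers are hypothetical stand-ins, and link_obj only mimics
->link() by reporting whether the (object, epoch) copy exists locally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the local stale directory. */
struct stale_entry { uint64_t oid; uint32_t epoch; };
struct stale_dir { const struct stale_entry *entries; size_t nr; };

/* Stand-in for ->link(): 0 if the copy exists locally, -1 otherwise. */
static int link_obj(const struct stale_dir *dir, uint64_t oid, uint32_t epoch)
{
	for (size_t i = 0; i < dir->nr; i++)
		if (dir->entries[i].oid == oid && dir->entries[i].epoch == epoch)
			return 0;
	return -1;
}

/* Old order: link the target epoch first and fail out on error. */
static int recover_local_old(const struct stale_dir *dir, uint64_t oid,
			     uint32_t tgt_epoch, uint32_t local_epoch)
{
	if (link_obj(dir, oid, tgt_epoch) < 0)
		return -1; /* never reaches the copy we actually have */
	return link_obj(dir, oid, local_epoch);
}

/* Fixed order: try the verified local copy before the target epoch. */
static int recover_local_fixed(const struct stale_dir *dir, uint64_t oid,
			       uint32_t tgt_epoch, uint32_t local_epoch)
{
	if (link_obj(dir, oid, local_epoch) == 0)
		return 0;
	return link_obj(dir, oid, tgt_epoch);
}
```

With only A.2 present locally and a target epoch of 5, the old order returns an
error while the fixed order succeeds.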

Hitoshi?...ping...


