[sheepdog] [PATCH v1 1/2] sheep: fix error in sheepdog cluster recovery

Liu Yuan namei.unix at gmail.com
Thu Feb 13 10:38:14 CET 2014


On Thu, Feb 13, 2014 at 05:23:06PM +0800, Robin Dong wrote:
> From: Robin Dong <sanbai at taobao.com>
> 
> Sheepdog failed to recover objects when we ran it on a 5-server cluster with
> about 20G of data in erasure-code mode.
> 
> The cause is in default_create_and_write(): it rename()s the object into the
> data directory and only then sets the ec-index xattr on it. This leaves a time
> window in which another process can read the data object but cannot get its
> ec-index xattr. That process then reports a get-xattr failure and removes the
> disk, because it treats the failure as an I/O error event.
> 

Good catch, applied. Thanks.

Yuan
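
For readers following the thread, the race described in the quoted commit
message can be avoided by setting the xattr on the temporary file before
rename(), so the object never becomes visible without its ec-index. The
following is a minimal sketch of that ordering on Linux, not the applied
patch: publish_object(), read_ec_index() and the attribute name
"user.ec.index" are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/xattr.h>
#include <unistd.h>

/* Hypothetical attribute name, used only for this illustration. */
#define EC_INDEX_XATTR "user.ec.index"

/*
 * Write the data and the ec-index xattr to a temporary file first and
 * rename() it into place last. rename() is atomic, so a concurrent
 * reader sees either no object or an object that already has its xattr.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int publish_object(const char *tmp_path, const char *final_path,
                          const void *buf, size_t len, uint8_t ec_index)
{
    int fd, saved;
    ssize_t n;

    assert(tmp_path != NULL && final_path != NULL);

    fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    n = write(fd, buf, len);
    if (n != (ssize_t)len) {
        if (n >= 0)
            errno = EIO;    /* short write: report it as an I/O error */
        goto fail;
    }

    /* Set the xattr while the file is still under its temporary name. */
    if (fsetxattr(fd, EC_INDEX_XATTR, &ec_index, sizeof(ec_index), 0) < 0)
        goto fail;
    if (fsync(fd) < 0)
        goto fail;
    if (close(fd) < 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;

    /* Only now does the object become visible to readers. */
    if (rename(tmp_path, final_path) < 0)
        goto fail;
    return 0;

fail:
    saved = errno;
    if (fd >= 0)
        close(fd);
    unlink(tmp_path);       /* never leave a half-written temp file */
    errno = saved;
    return -1;
}

/* Reader side: fetch the ec-index of a published object. */
static int read_ec_index(const char *path, uint8_t *ec_index)
{
    ssize_t n = getxattr(path, EC_INDEX_XATTR, ec_index, sizeof(*ec_index));

    return n == (ssize_t)sizeof(*ec_index) ? 0 : -1;
}
```

With this ordering, a failure to set the xattr (for example on a filesystem
without user xattr support) leaves nothing at the final path, so a reader
cannot mistake a half-published object for a disk I/O error.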

