[sheepdog] [PATCH 2/2] collie: add a check&repair command
Christoph Hellwig
hch at infradead.org
Mon Jun 25 12:51:16 CEST 2012
On Wed, Jun 20, 2012 at 06:16:02PM +0800, Liu Yuan wrote:
> "With the following scenarios, object replicas could have different
> contents:
>
> - a gateway node fails while forwarding write requests
> - total node failure happens while writing objects
>
> In such cases, it is okay for VMs not to read the latest data from
> the inconsistent objects because the VMs received EIO from them
> before. However, the objects' inconsistency still needs to be fixed
> so that the VMs won't read different data from the objects next
> time."
>
> So when those two cases happen, users are expected to run:
>
> $ collie check affected_vdi_name
Requiring manual user intervention when a node goes down in a
distributed storage system is entirely unacceptable. I'm happy to kill
the dumb version of the consistency fix, but in exchange sheepdog needs
to have a better internal method to deal with this failure instead of
bailing out.
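
To make the idea of an internal fix concrete, here is a minimal sketch in
Python. It is not sheepdog code: the function names, the dict-of-replicas
representation, and the "most common content wins" policy are all
assumptions made for illustration. It relies only on the point quoted
above, that the replicas need to end up identical and need not hold the
latest data.

```python
import hashlib
from collections import Counter


def is_inconsistent(replicas):
    """Return True when the replicas of one object do not all match.

    `replicas` is a hypothetical mapping of node name -> object bytes.
    """
    digests = {hashlib.sha1(data).hexdigest() for data in replicas.values()}
    return len(digests) > 1


def repair(replicas):
    """Return a new mapping in which every node holds the same content.

    Assumed policy: the most common content wins; on a tie, the content
    seen first wins. Any single consistent choice would satisfy the
    requirement that later reads return the same data.
    """
    if not is_inconsistent(replicas):
        return dict(replicas)
    winner, _count = Counter(replicas.values()).most_common(1)[0]
    return {node: winner for node in replicas}


if __name__ == "__main__":
    # One node received the write before the failure, two did not.
    replicas = {"node-a": b"new", "node-b": b"old", "node-c": b"old"}
    fixed = repair(replicas)
    print(is_inconsistent(replicas), is_inconsistent(fixed))
```

A storage daemon could run a check of this kind itself once a failed node
is detected, rather than waiting for an administrator to issue a command.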