On Wed, Jun 20, 2012 at 06:16:02PM +0800, Liu Yuan wrote:
> "With the following scenarios, object replicas could have the different
> contents:
>
> - a gateway node fails while forwarding write requests
> - total node failure happens while writing objects
>
> In the such cases, it is okay for VMs not to read the latest data from
> the inconsistent objects because the VMs received EIO from them
> before. However, it is still needed to fix the objects' inconsistency
> so that the VMs won't read the different data from the objects next
> time."
>
> So when those two cases happen, users are expected to run:
>
> $ collie check affected_vdi_name

Requiring manual user intervention when a node goes down in a
distributed storage system is entirely unacceptable. I'm happy to
kill the dumb version of the consistency fix, but in exchange sheepdog
needs a better internal method to deal with this failure instead of
bailing out.