On Mon, 27 Sep 2010 21:58:04 +0900 (JST)
Hirokazu Takahashi <taka at valinux.co.jp> wrote:

> > Yeah, but even real disk could return bogus data (that is, silent data
> > corruption). So this issue (returning bogus data) is about the
> > possibility.
>
> The modern disks and HBAs can detect bogus data in most cases, but

Are you talking about SCSI DIF, or about high-end storage systems that
use checksumming internally? I'm not sure that a modern SATA disk can
detect such failures.

> there are still possibilities.

Yes.

> > In addition, as you know, the recent file systems can handle such
> > failure.
>
> Yes, I know some filesystem got such a feature. But there is no point
> to return bogus data instead of an EIO error.

Yeah, but returning EIO in such cases makes an implementation more
complicated.

> > > VastSky updates all the mirrors synchronously. And only after all
> > > the I/O requests are completed, it tells the owner that the request
> > > is done.
> >
> > Understood. Sheepdog works in the same way.
> >
> > How does Vastsky detect old data?
> >
> > If one of the mirror nodes is down (e.g. the node is too busy),
> > Vastsky assigns a new node?
>
> Right.
> Vastsky makes the node that seems to be down deleted from the group
> and assigns a new one. Then, no one can access to the old one after that.

How does VastSky store the group information? For example, suppose
VastSky assigns a new node, updates the data on all the replica nodes,
and returns success to the client, and right after that all the nodes go
down due to a power failure. After all the nodes boot up again, can
VastSky still detect the old data?

> > Then if a client issues READ and all the
> > mirrors are down but the old mirror node is alive, how does Vastsky
> > prevent returning the old data?
>
> About this issue, we have a plan:
> When a node is down and it's not because of a hardware error,
> we will make VastSky try to re-synchronize the node again.

Yeah, that's necessary, especially when each node holds a huge amount of
data.
Sheepdog can do that.

> This will be done in a few minutes because VastSky traces all write
> I/O requests to know which sectors of the node aren't synchronized.

How does VastSky store the trace log safely? (I guess that the trace log
is saved on multiple hosts.) Does VastSky update the log on every WRITE
request?

> And you should know VastSky won't easily give up a node which seems
> to be down. VastSky tries to reconnect the session and even tries to
> use another path to access the node.

Hmm, but that just means that a client I/O request takes a long time.
Even if VastSky doesn't give up, a client (i.e. an application) doesn't
want to wait that long.
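
To make the trace-log question concrete, here is a rough sketch of the
kind of write tracing discussed above. It is not VastSky's or Sheepdog's
actual code: the class name, the method names, the region size, and the
in-memory set are all assumptions made for illustration. A real log
would have to be made durable (and probably replicated) before a WRITE
is acknowledged, which is exactly the cost I am asking about.

```python
class DirtyRegionLog:
    """Hypothetical write-trace log for one mirror that is offline.

    Writes are tracked per fixed-size region rather than per sector,
    so the log stays small at the cost of copying a bit more data
    during resynchronization.
    """

    def __init__(self, region_sectors=2048):
        if region_sectors <= 0:
            raise ValueError("region_sectors must be positive")
        self.region_sectors = region_sectors
        self.dirty = set()  # region indexes the offline mirror has missed

    def record_write(self, start_sector, num_sectors):
        """Mark every region touched by a WRITE as out of sync."""
        if start_sector < 0 or num_sectors <= 0:
            raise ValueError("invalid sector range")
        first = start_sector // self.region_sectors
        last = (start_sector + num_sectors - 1) // self.region_sectors
        self.dirty.update(range(first, last + 1))

    def regions_to_resync(self):
        """Return the regions that must be copied to the returning mirror."""
        return sorted(self.dirty)

    def mark_synced(self, region):
        """Clear a region once it has been copied successfully."""
        self.dirty.discard(region)


# Example: 8-sector regions, three writes while the mirror is down.
log = DirtyRegionLog(region_sectors=8)
log.record_write(0, 4)    # sectors 0..3, region 0
log.record_write(6, 4)    # sectors 6..9, regions 0 and 1
log.record_write(100, 1)  # sector 100, region 12
pending = log.regions_to_resync()
```

With a log like this, only the regions in `pending` need to be copied
when the node comes back, which is why resynchronization can finish in
minutes. The open question remains where the log itself lives and how
often it is flushed.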