[sheepdog] questions about sheepdog write policy
Hitoshi Mitake
mitake.hitoshi at gmail.com
Fri May 27 10:14:20 CEST 2016
On Thu, May 26, 2016 at 4:19 PM, Dong Wu <archer.wudong at gmail.com> wrote:
> Thanks for your reply.
>
> 2016-05-26 10:34 GMT+08:00 Hitoshi Mitake <mitake.hitoshi at gmail.com>:
> >
> >
> > On Tue, May 24, 2016 at 6:46 PM, Dong Wu <archer.wudong at gmail.com> wrote:
> >>
> >> Hi Mitake,
> >> I have some questions about sheepdog's write policy.
> >> For replication, sheepdog writes 3 copies by default and is strongly
> >> consistent.
> >> My questions are:
> >> 1) If some replicas are written successfully and others fail, will it
> >> keep retrying the write until all 3 replicas succeed? And if fewer
> >> than 3 nodes are left, will it write fewer than 3 replicas and
> >> return success to the client?
> >
> >
> > In the case of a disk or network I/O error, sheep returns an error to its
> > client immediately. In some cases (e.g. an epoch increase caused by a node
> > joining or leaving), it will retry.
>
> Will the client retry? If the error is caused by only one of the
> replicas (e.g. that replica's disk has failed), and the other two are OK
> and were written successfully, is returning an error to the client
> reasonable? Why not just return success to the client and then recover
> the failed replica?
>
It is to ensure that all 3 replicas are consistent. Sheepdog's interface is
a virtual disk, so consistency is more important than availability.
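
To illustrate the policy, here is a minimal sketch in Python. It is not
sheepdog's actual code; replicated_write and the per-replica write callables
are hypothetical names used only for this example. The point is that a write
is reported as successful only when every replica accepted it.

```python
from typing import Callable, List

# Hypothetical stand-in for one replica's write path; not a sheepdog API.
ReplicaWrite = Callable[[bytes], bool]


def replicated_write(replicas: List[ReplicaWrite], data: bytes) -> bool:
    """Return True only if every replica accepted the write."""
    results = []
    for write in replicas:
        try:
            results.append(write(data))
        except OSError:
            # Disk or network I/O error on this replica.
            results.append(False)
    # Consistency over availability: one failed copy fails the whole request.
    return all(results)
```

With three replicas where one raises an I/O error, this returns False even
though two copies were written, which is the situation discussed above.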
>
> >
> >>
> >> 2) If some replicas are written successfully, others fail, and a failure
> >> is returned to the client, how is the data consistency of these replicas
> >> handled (the nodes where the write succeeded have new data, but the nodes
> >> where it failed have old data)? If the client reads the same block, will
> >> it read the new data or the old data?
> >
> >
> > In such a case, we need to repair consistency with "dog vdi check" command.
> > Note that in such a case the failed VDIs won't be accessed from VMs anymore
> > because they will be used in read-only mode.
>
> Does this mean data can't be read from this VDI until recovery is done?
> I remember that in old versions of sheepdog, the read I/O path first
> checked the replicas' consistency and then read the data,
> but I can't find that logic anymore in the latest version.
>
The data can be read, and in practice it would work well in many cases.
I'm not sure about that feature of the old version, but it seems costly
for the ordinary read path. Reviving it as an option would be reasonable,
though. What do you think?
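
As a rough illustration of the cost, here is a minimal Python sketch of a
read that verifies replicas first. It is an assumption about how such a
check could look, not the old sheepdog implementation; checked_read is a
hypothetical name. Every read has to fetch and hash all copies instead of
reading just one.

```python
import hashlib
from typing import List


def checked_read(replica_copies: List[bytes]) -> bytes:
    """Return the object only if all replica copies are identical."""
    # Hashing every copy is what makes this expensive on the read path.
    digests = {hashlib.sha1(copy).hexdigest() for copy in replica_copies}
    if len(digests) != 1:
        raise IOError("replicas differ; repair needed before reading")
    return replica_copies[0]
```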
Thanks,
Hitoshi
>
> >
> > Thanks,
> > Hitoshi
> >
> >>
> >>
> >> Thanks a lot.
> >
> >
>