[Sheepdog] Sheepdog Read-only issues.
morita.kazutaka at lab.ntt.co.jp
Fri Apr 8 20:49:37 CEST 2011
At Thu, 7 Apr 2011 02:06:07 -0400,
Eric Renfro wrote:
> I just started using Sheepdog and I'm curious as to why this is occurring,
> if it's a known issue or a resolved issue.
> What's happening to me is, when one of my sheep servers is taken down, it
> causes server-wide issues, especially with running VMs. I have 6 sheep
> servers running on 6 physical computers. 4 of the servers run kvm guests
> along with sheep; the other 2 are storage-only servers. I currently run
> sheepdog through pacemaker as a primitive lsb resource on every sheep node.
> When I stop pacemaker on nas2 (a storage-only server), VMs on vservers 1-4
> suddenly get I/O errors and their filesystems remount read-only. Some guests
> won't restart properly on the same node and have to be migrated to another
> node; others do restart. Either way, the only way to restore access is to
> reboot the guest VM outright. Each guest VM uses localhost:7000 for sheep
> access to the sheepdog VDIs.
Thanks for your report! This is not a known issue, but I confirmed
that the sheep daemon could return EIO when the cluster membership is
changing. I'll dig into this problem soon.
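For illustration only, here is a minimal Python sketch of the kind of
client-side handling that would mask a transient EIO while the membership
is changing. It is not sheepdog or qemu code: the name retry_on_eio and its
parameters are hypothetical, and it assumes the failure surfaces to the
caller as an OSError whose errno is EIO.

```python
import errno
import time


def retry_on_eio(op, attempts=5, delay=0.1, sleep=time.sleep):
    """Call op(), retrying with backoff when it fails with EIO.

    Hypothetical helper: assumes EIO is transient (e.g. raised while the
    cluster membership is changing) and that op() is safe to repeat.
    """
    for attempt in range(attempts):
        try:
            return op()
        except OSError as exc:
            # Anything other than EIO is re-raised immediately, and EIO is
            # re-raised once the last attempt has been used up.
            if exc.errno != errno.EIO or attempt == attempts - 1:
                raise
            # Exponential backoff: delay, 2*delay, 4*delay, ...
            sleep(delay * (2 ** attempt))
```

Whether retrying is actually safe depends on the operation being idempotent,
which this sketch simply assumes.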
> I'm running this platform all on OpenSUSE 11.4 with qemu 0.14.0 from the
> standard opensuse repositories (not the virtualization repository) and a
> reasonably current sheepdog git build.
> I set up the sheepdog collie cluster to maintain 3 copies as well.
> In another test, during my initial testing of sheepdog, I had just 2
> vservers running sheepdog with VM guests on the same 2, using only 2
> copies. Crowbarring pacemaker into standby mode to test migration of the kvm
> sessions ended up destroying the sheepdog cluster completely, losing all
Could you explain in more detail what you did? What's the
configuration of your pacemaker? What commands did you run to
put pacemaker into standby mode and migrate your virtual machine? I'd
like to reproduce the problem.
> of the VDIs; it could not find a specific obj file it was looking for in
> the cluster, so it kept trying endlessly. I ended up having to reformat
> the cluster, which is when I got my two storage servers rebuilt to
> handle 2 more sheep clusters and set it up to use 3 copies amongst 4
> servers. Finally, the 2 other vservers were joined into the sheepdog
> cluster as a whole.
> I'd be glad to hear any information regarding this problem. So far it
> looks like Sheepdog is going to be very strong and powerful and meet my
> needs, as long as I can get around this problem.
> Eric Renfro