[Sheepdog] Drive Failure

MORITA Kazutaka morita.kazutaka at gmail.com
Sun Apr 24 06:54:09 CEST 2011


Hi,

Thanks for your feedback.

At Sat, 23 Apr 2011 20:17:33 -0500,
Greg Zapp wrote:
> I have sheepdog running on two nodes.  The sheepdog store is a single drive
> separate from the OS drive.  I'm running an ubuntu VM on node A.  If I force
> unmount the store, the VM craps itself.  Shouldn't sheepdog be able to
> gracefully handle a drive failure or unmounted file system?

Yes, this really should be handled.  In this case, Sheepdog must do
the following automatically:

  - kill the sheep daemon on node A when the store is unavailable
  - fail over the VM connection to node B

In particular, the first needs to be resolved as soon as possible
because a single disk failure shouldn't affect the availability of
the whole system.
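
As a rough illustration of the first item only, an external watchdog
could look something like the sketch below.  This is not how sheep
works today and not a proposed patch; the function names, the probe
file prefix, and the choice of SIGTERM are all assumptions made for
the example.  It checks that the store directory is still a mount
point (after a forced unmount the empty directory on the OS drive
would otherwise still accept writes) and that a small file can be
written and synced there.

```python
import os
import signal
import tempfile


def store_is_available(store_dir, require_mount=True):
    """Return True if store_dir looks usable (hypothetical helper).

    require_mount=True assumes the store is its own mount point, as in
    the setup described in this thread.
    """
    if require_mount and not os.path.ismount(store_dir):
        return False
    try:
        # Creating, syncing and removing a probe file exercises the file
        # system, so a failed store shows up as an OSError.
        fd, probe = tempfile.mkstemp(prefix=".probe-", dir=store_dir)
        try:
            os.write(fd, b"ok")
            os.fsync(fd)
        finally:
            os.close(fd)
            os.unlink(probe)
        return True
    except OSError:
        return False


def check_once(store_dir, daemon_pid, require_mount=True, kill=os.kill):
    """Signal the daemon if the store is unusable.

    Returns True if a signal was sent.  The kill argument can be
    replaced to try the logic without touching a real process.
    """
    if store_is_available(store_dir, require_mount):
        return False
    kill(daemon_pid, signal.SIGTERM)
    return True
```

Something would have to call check_once() periodically with the store
path and the pid of the sheep daemon on that node.  Doing the check
inside the daemon itself would be better, since I/O errors there are
seen immediately rather than at the next poll.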

> 
> Update:
> 
> In order to bring the VM up on node B, I had to kill sheep on node A.
> However, when I remounted the drive and brought sheep back up on node A it
> ruined everything.  I shut down sheep on node A and cleared the store
> directory then brought it back up.  After it synced I was still unable to
> boot the VM on node A or node B.  It's hosed.

This should be handled in the current version, so it seems to be a bug.
Unfortunately, I couldn't reproduce the problem in my environment.
Could your VM not even find the sheepdog volume?  Or did it find the
volume but hang during boot?


Thanks,

Kazutaka


