At Sun, 24 Apr 2011 17:56:59 +0900,
MORITA Kazutaka wrote:
>
> At Sun, 24 Apr 2011 01:14:36 -0500,
> Greg Zapp wrote:
> >
> > If the sheep daemon were killed would VMs on node A keep running?  That's the desired outcome.  I don't want a single storage error to cause any VMs to go down...
>
> That's the final goal though not supported yet.  The ideal behavior is
> that when the sheep on node A is killed, the VM on node A reconnects
> to the sheep on node B.  In my plan, it will be done in the version
> 0.4.0.
>   https://sourceforge.net/apps/trac/sheepdog/ticket/1

On second thought, we don't need reconnection in this case.  What we
need to do here is:

 - remove node A from a consistent hash ring
 - keep node A as a gateway node

I've posted a patch to support this, and it will be included in the
next version, 0.2.3.

Thanks,

Kazutaka

>
> >
> > The vdi is found but the vm hangs during boot.
>
> Thank you.  Currently, Sheepdog I/O is a bit unstable while node
> membership changes, and I guess it caused data loss of your VM.  I'll
> fix them soon.
>
> Thanks,
>
> Kazutaka
>
> >
> >
> > On Apr 23, 2011, at 11:54 PM, MORITA Kazutaka <morita.kazutaka at gmail.com> wrote:
> >
> > > Hi,
> > >
> > > Thanks for your feedback.
> > >
> > > At Sat, 23 Apr 2011 20:17:33 -0500,
> > > Greg Zapp wrote:
> > >> I have sheepdog running on two nodes. The sheepdog store is a single drive
> > >> separate from the OS drive. I'm running an ubuntu VM on node A. If I force
> > >> unmount the store, the VM craps itself. Shouldn't sheepdog be able to
> > >> gracefully handle a drive failure or unmounted file system?
> > >
> > > Yes, it really should be handled...  In this case, Sheepdog must do
> > > the below automatically:
> > >
> > >  - kill the sheep daemon on node A when the store is unavailable
> > >  - failover the VM connection to node B
> > >
> > > In particular, the first needs to be resolved as soon as possible
> > > because one disk failure shouldn't affect the availability of the
> > > total system.
> > >
> > >>
> > >> Update:
> > >>
> > >> In order to bring the VM up on node B, I had to kill sheep on node A.
> > >> However, when I remounted the drive and brought sheep back up on node A it
> > >> ruined everything. I shut down sheep on node A and cleared the store
> > >> directory then brought it back up. After it synced I was still unable to
> > >> boot the VM on node A or node B. It's hosed.
> > >
> > > This should be handled in the current version and seems to be a bug.
> > > Unfortunately, I couldn't reproduce the problem in my environment.
> > > Your VM couldn't even find the sheepdog volume?  Or your VM found the
> > > volume but hung during boot?
> > >
> > >
> > > Thanks,
> > >
> > > Kazutaka
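
For illustration only: the two steps proposed at the top of this message
(remove node A from the consistent hash ring, keep node A as a gateway
node) can be sketched roughly as below.  This is a toy Python model, not
Sheepdog's code or the posted patch; the Cluster class, its method names,
the SHA-1 hash and the 64 virtual nodes per host are all assumptions made
up for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Any stable hash works for this sketch; SHA-1 is an arbitrary choice.
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class Cluster:
    """Toy model: members accept client I/O; only ring nodes store objects."""

    def __init__(self, nodes, vnodes=64):
        self.members = set(nodes)   # nodes that may act as gateways
        self.vnodes = vnodes        # hypothetical virtual-node count
        self.ring = []              # sorted list of (hash, node)
        for node in nodes:
            self._add_to_ring(node)

    def _add_to_ring(self, node):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (_hash(f"{node}#{i}"), node))

    def demote_to_gateway(self, node):
        # Step 1: remove the node from the consistent hash ring.
        self.ring = [(h, n) for h, n in self.ring if n != node]
        # Step 2: leave it in self.members, so local VMs can keep sending
        # their I/O to it without reconnecting elsewhere.

    def locate(self, object_id):
        # First ring entry clockwise from the object's hash, wrapping around.
        i = bisect.bisect_right(self.ring, (_hash(object_id), ""))
        return self.ring[i % len(self.ring)][1]

    def forward(self, gateway, object_id):
        # A gateway only routes the request to the node storing the object.
        if gateway not in self.members:
            raise ValueError(f"{gateway} is not a cluster member")
        return self.locate(object_id)


cluster = Cluster(["A", "B"])
cluster.demote_to_gateway("A")
# Requests still enter through A, but every object now resolves to B.
targets = {cluster.forward("A", f"obj-{i}") for i in range(100)}
```

In this model the VM on node A never changes its connection: node A keeps
accepting requests and forwards them, while object placement is computed
over the remaining ring members only.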