2011/8/30 Keiichi SHIMA <shima at wide.ad.jp>:
> 5. at some point, the sheep cluster stop working.
>    'collie vdi list' is start showing errors ('failed to read a inode header ...')
>    'collie node info' is start showing errors (the same message as above)

A few days ago I noticed that problem too:

# collie vdi list
  name        id    size    used  shared    creation time   vdi id
------------------------------------------------------------------
failed to read a inode header 10701927, 0, 42

This was on a small 3-node cluster. The VDI had been created while all 3 nodes were up. I shut down node3, and then I got the error. Anyway, the cluster didn't stop and it completed the sync. After that the error message was no longer shown. I tried shutting down node3 again, and this time there was no error message.

I'll download the latest sheepdog and see whether I get any error messages when adding/removing nodes.

> What is the right way to remove a node from a cluster?

I'm wondering the same. What happens if I kill all sheep processes but leave corosync running (and wait several minutes or even hours)? And vice versa?

Probably the best way is to kill all sheep processes and then stop corosync as soon as possible:

pkill sheep; /etc/init.d/corosync stop
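
If it helps for testing, the one-liner above could be wrapped in a small script with a dry-run switch. This is only a sketch of the order I guessed at (sheep first, then corosync), not a confirmed procedure for removing a node. The DRY_RUN variable and the function names run and stop_node are made up for this example.

```shell
#!/bin/sh
# Sketch only: stop sheep, then corosync, on the node being taken out.
# DRY_RUN, run and stop_node are hypothetical names for this example.

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

stop_node() {
    # pkill exits non-zero when no process matched; do not treat that as fatal.
    run pkill sheep || true
    # Stop corosync right after the sheep processes are gone.
    run /etc/init.d/corosync stop
}
```

Setting DRY_RUN=1 and then calling stop_node only prints the two commands, so the order can be checked before running it for real on a node.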