[Sheepdog] unstable behavior when nodes join or leave
Valerio Pachera
sirio81 at gmail.com
Fri Sep 2 14:33:42 CEST 2011
2011/8/30 Keiichi SHIMA <shima at wide.ad.jp>:
> 5. at some point, the sheep cluster stop working.
> 'collie vdi list' is start showing errors ('failed to read a inode header ...')
> 'collie node info' is start showing errors (the same message as above)
A few days ago I noticed that problem too:
# collie vdi list
name id size used shared creation time vdi id
------------------------------------------------------------------
failed to read a inode header 10701927, 0, 42
This was on a small 3-node cluster.
A vdi disk had been created while all 3 nodes were on.
I shut off node3, then I got the error.
Anyway, the cluster didn't stop and completed the sync. The error
message was then not shown anymore.
I tried to shut off node3 again and this time got no error message.
I will download the latest sheepdog and see if I get any error message
when adding/removing nodes.
> What is the right way to remove a node from a cluster?
I'm wondering the same.
What happens if I kill all sheep processes but leave corosync running
(and wait several minutes or even hours)? And vice versa?
Probably the best way is to kill all sheep processes and corosync as
soon as possible.
pkill sheep; /etc/init.d/corosync stop
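
A slightly more careful variation of the one-liner above, as an untested
sketch: it waits for the sheep processes to exit before stopping corosync.
This ordering is my own assumption, not something confirmed by the sheepdog
developers. The function name stop_node, the variables SHEEP_NAME and
COROSYNC_STOP, and the timeout are hypothetical; only pkill, pgrep and the
corosync init script are real.

```shell
#!/bin/sh
# Sketch only: stop sheep, wait for it to exit, then stop corosync.
# SHEEP_NAME, COROSYNC_STOP and stop_node are illustrative names (assumptions).
SHEEP_NAME="${SHEEP_NAME:-sheep}"
COROSYNC_STOP="${COROSYNC_STOP:-/etc/init.d/corosync stop}"

stop_node() {
    timeout="${1:-10}"                 # seconds to wait for sheep to exit
    pkill -x "$SHEEP_NAME" || true     # SIGTERM to all sheep processes, if any
    while pgrep -x "$SHEEP_NAME" >/dev/null 2>&1; do
        if [ "$timeout" -le 0 ]; then
            echo "sheep still running, not stopping corosync" >&2
            return 1
        fi
        sleep 1
        timeout=$((timeout - 1))
    done
    $COROSYNC_STOP                     # stop corosync once sheep is gone
}
```

Calling `stop_node 30` on the node to be removed would wait up to 30
seconds for sheep to exit and leave corosync running if it does not.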