[sheepdog-users] sheep doesn't join the cluster on reboot
Liu Yuan
namei.unix at gmail.com
Fri Jan 11 15:16:47 CET 2013
On 01/11/2013 10:00 PM, Valerio Pachera wrote:
> Hi, I've been able to reproduce this problem:
>
> I have a node that had been out of the cluster for a long time.
> I chose to empty its data folder before joining the cluster again.
> I didn't run the daemon from a shell, but put
> /usr/sbin/sheep /mnt/sheepdog
> in
> /etc/rc.local
> and rebooted the node.
>
> The daemon starts successfully but it doesn't join the cluster.
>
> Node1
> root@sheepdog001:~# collie node list
> M Id Host:Port V-Nodes Zone
> - 0 192.168.2.41:7000 64 688040128
>
> Node2
> root@sheepdog002:~# collie node list
> M Id Host:Port V-Nodes Zone
> - 0 192.168.2.42:7000 57 704817344
> - 1 192.168.2.43:7000 106 721594560
> - 2 192.168.2.215:7000 29 -687691584
>
> root@sheepdog001:~# more /mnt/sheepdog/sheep.log
> Jan 11 14:41:34 [main] send_join_request(1014) IPv4 ip:192.168.2.41 port:7000
> Jan 11 14:41:34 [main] main(616) sheepdog daemon (version 0.5.5_36_g6101dbe) started
> Jan 11 14:41:34 [main] update_cluster_info(799) status = 2, epoch = 0, finished: 0
>
> I also tried to postpone the daemon startup:
> sleep 30; /usr/sbin/sheep /mnt/sheepdog
> but it makes no difference.
>
> If I reboot the node and then run
> sheep /mnt/sheepdog
> the node joins the cluster correctly.
>
> sheep -h 0.5.5_36_g6101dbe
> Debian wheezy 3.2.0-4-amd64
>
I think this is more likely a corosync bug. Check the corosync messages in
/var/log/syslog and compare the good case with the bad one.
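
One possible way to do that comparison is sketched below. It is only an
illustration: the helper name corosync_lines and the file names in the usage
note are hypothetical, and it assumes the classic syslog line layout
("Mon DD HH:MM:SS host program[pid]: message"). It keeps the corosync lines
and strips the timestamp, host name and PID so that two boots can be diffed.

```shell
#!/bin/sh
# Hypothetical helper (not part of sheepdog or corosync).
# Prints the corosync lines from a syslog-style file, with the leading
# "Mon DD HH:MM:SS host " prefix and the "[pid]" removed, so that captures
# taken from two different boots can be compared line by line.
corosync_lines() {
    grep -i 'corosync' "$1" \
        | sed -E \
            -e 's/^[A-Z][a-z]{2} +[0-9]+ [0-9:]{8} [^ ]+ //' \
            -e 's/corosync\[[0-9]+\]/corosync/'
}
```

A capture could then be taken after each kind of start and compared, for
example (paths are assumptions):

    cp /var/log/syslog /root/syslog.bad     # after a boot where sheep did not join
    cp /var/log/syslog /root/syslog.good    # after a manual start that joined
    corosync_lines /root/syslog.good > /tmp/corosync.good
    corosync_lines /root/syslog.bad  > /tmp/corosync.bad
    diff -u /tmp/corosync.good /tmp/corosync.bad
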
Thanks,
Yuan