[sheepdog-users] strange behavior when deconfiguring a nic

Valerio Pachera sirio81 at gmail.com
Fri Dec 21 10:32:15 CET 2012


I posted this issue before, but I think it deserves a thread of its own.

*If I deconfigure the NIC of a node, it is impossible to get the node
back into the cluster.*

On the 3rd node of my cluster:
  ifdown eth0
or
  ip addr del 192.168.2.43/24 dev eth0

Once the IP was restored, I started the sheep daemon again.
It starts, and sheep.log shows this:
  Dec 21 10:17:18 [main] jrnl_recover(230) opening the directory /mnt/sheepdog/journal/
  Dec 21 10:17:18 [main] jrnl_recover(235) starting journal recovery
  Dec 21 10:17:18 [main] jrnl_recover(291) journal recovery complete
  Dec 21 10:17:18 [main] send_join_request(1014) IPv4 ip:192.168.2.43 port:7000
  Dec 21 10:17:19 [main] main(616) sheepdog daemon (version 0.5.5_23_gfacdf48) started
  Dec 21 10:17:21 [main] update_cluster_info(799) status = 4, epoch = 8, finished: 0

Nodes 1 and 2 don't show node 3 in 'cluster info'.
Node 3 shows all nodes in 'cluster info' (the state of the cluster
before it went offline).
Node recovery does not start.
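
To compare what each node believes, it may help to pull the status and
epoch out of each node's sheep.log. This is only a rough sketch: the
function name last_cluster_state is made up for illustration, and it
assumes the update_cluster_info line always has the layout shown in the
excerpt above.

```shell
# Hypothetical helper: read sheep.log on stdin and print "status epoch"
# from the last update_cluster_info line. The field layout is assumed
# from the single log line quoted in this post.
last_cluster_state() {
    sed -n 's/.*update_cluster_info([0-9]*) status = \([0-9]*\), epoch = \([0-9]*\),.*/\1 \2/p' |
        tail -n 1
}
```

Run on each node as `last_cluster_state < sheep.log` (from the directory
that holds the log) and compare the epoch values side by side.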

Killing and restarting the sheep daemon doesn't help.
I also tried restarting corosync, without luck.
*After a reboot of the node, the sheep daemon is able to join the cluster!*

More details in the attached file.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster_info.log
Type: application/octet-stream
Size: 726 bytes
Desc: not available
URL: <http://lists.wpkg.org/pipermail/sheepdog-users/attachments/20121221/7609a1d1/attachment-0003.obj>

