[Sheepdog] create_cluster(1652) Failed to join the sheepdog group, try again
Steven Dake
sdake at redhat.com
Tue Sep 14 21:28:45 CEST 2010
On 09/12/2010 08:30 AM, Narendra Prasad Madanapalli wrote:
> Hi Kazutaka,
>
> Surprisingly, it works today: corosync -f doesn't report any errors.
> However, ping6 fails with the following error:
>
> # ping6 fe80::21e:4cff:fe59:9936
> connect: Invalid argument
>
>
You may have a configuration problem with your interfaces not setting up
routes properly.
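
One thing worth ruling out first (an assumption, not something confirmed in
this thread): fe80::/10 addresses are link-local, so the kernel has to be told
which interface to use. ping6 typically reports "connect: Invalid argument"
when a link-local target is given without one; `ping6 -I wlan0
fe80::21e:4cff:fe59:9936` names the interface explicitly. The Python sketch
below only classifies an address and appends a zone suffix. The helper names
needs_scope and with_scope are hypothetical.

```python
import ipaddress


def needs_scope(addr: str) -> bool:
    """Return True for an IPv6 link-local address that has no %zone suffix."""
    host, _, zone = addr.partition("%")
    ip = ipaddress.ip_address(host)
    return ip.version == 6 and ip.is_link_local and not zone


def with_scope(addr: str, ifname: str) -> str:
    """Append %ifname when the address is link-local and unscoped."""
    return f"{addr}%{ifname}" if needs_scope(addr) else addr


if __name__ == "__main__":
    # wlan0 is the interface that carries this address in the ifconfig output.
    print(with_scope("fe80::21e:4cff:fe59:9936", "wlan0"))
```

If the scoped form answers pings and the bare form does not, the interface
itself is probably fine and the missing scope is the more likely culprit.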
Regards
-steve
> =====corosync.log
> Sep 13 02:20:58 corosync [MAIN ] Corosync Cluster Engine ('1.2.7'):
> started and ready to provide service.
> Sep 13 02:20:58 corosync [MAIN ] Corosync built-in features: nss rdma
> Sep 13 02:20:58 corosync [MAIN ] Successfully read main configuration
> file '/etc/corosync/corosync.conf'.
> Sep 13 02:20:58 corosync [TOTEM ] Initializing transport (UDP/IP).
> Sep 13 02:20:58 corosync [TOTEM ] Initializing transmit/receive
> security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Sep 13 02:20:58 corosync [TOTEM ] The network interface
> [fe80::21e:4cff:fe59:9936] is now up.
> Sep 13 02:20:58 corosync [SERV ] Service engine loaded: corosync
> extended virtual synchrony service
> Sep 13 02:20:58 corosync [SERV ] Service engine loaded: corosync
> configuration service
> Sep 13 02:20:58 corosync [SERV ] Service engine loaded: corosync
> cluster closed process group service v1.01
> Sep 13 02:20:58 corosync [SERV ] Service engine loaded: corosync
> cluster config database access v1.01
> Sep 13 02:20:58 corosync [SERV ] Service engine loaded: corosync
> profile loading service
> Sep 13 02:20:58 corosync [SERV ] Service engine loaded: corosync
> cluster quorum service v0.1
> Sep 13 02:20:58 corosync [MAIN ] Compatibility mode set to whitetank.
> Using V1 and V2 of the synchronization engine.
> Sep 13 02:20:58 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Sep 13 02:20:58 corosync [MAIN ] Completed service synchronization,
> ready to provide service.
> Sep 13 02:21:32 corosync [SERV ] Unloading all Corosync service engines.
> Sep 13 02:21:32 corosync [SERV ] Service engine unloaded: corosync
> extended virtual synchrony service
> Sep 13 02:21:32 corosync [SERV ] Service engine unloaded: corosync
> configuration service
> Sep 13 02:21:32 corosync [SERV ] Service engine unloaded: corosync
> cluster closed process group service v1.01
> Sep 13 02:21:32 corosync [SERV ] Service engine unloaded: corosync
> cluster config database access v1.01
> Sep 13 02:21:32 corosync [SERV ] Service engine unloaded: corosync
> profile loading service
> Sep 13 02:21:32 corosync [SERV ] Service engine unloaded: corosync
> cluster quorum service v0.1
> Sep 13 02:21:32 corosync [MAIN ] Corosync Cluster Engine exiting with
> status 0 at main.c:170.
> Sep 13 02:23:27 corosync [MAIN ] Corosync Cluster Engine ('1.2.7'):
> started and ready to provide service.
> Sep 13 02:23:27 corosync [MAIN ] Corosync built-in features: nss rdma
> Sep 13 02:23:27 corosync [MAIN ] Successfully read main configuration
> file '/etc/corosync/corosync.conf'
> ==============
>
>
> Thanks,
> Narendra.
>
> On Mon, Sep 6, 2010 at 1:36 PM, MORITA Kazutaka
> <morita.kazutaka at lab.ntt.co.jp> wrote:
>> At Sat, 4 Sep 2010 11:36:39 +0530,
>> Narendra Prasad Madanapalli wrote:
>>>
>>> Hi Steve,
>>>
>>> Please find below the output of ifconfig and the contents of corosync.log
>>>
>>> ===========ifconfig
>>> [nlakn at naninf ~]$ ifconfig
>>> eth0 Link encap:Ethernet HWaddr 00:1B:24:69:92:11
>>> UP BROADCAST MULTICAST MTU:1500 Metric:1
>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>> collisions:0 txqueuelen:1000
>>> RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
>>> Interrupt:27 Base address:0xa000
>>>
>>> lo Link encap:Local Loopback
>>> inet addr:127.0.0.1 Mask:255.0.0.0
>>> inet6 addr: ::1/128 Scope:Host
>>> UP LOOPBACK RUNNING MTU:16436 Metric:1
>>> RX packets:8 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
>>> collisions:0 txqueuelen:0
>>> RX bytes:480 (480.0 b) TX bytes:480 (480.0 b)
>>>
>>> virbr0 Link encap:Ethernet HWaddr 22:83:14:EC:B9:66
>>> inet addr:192.168.122.1 Bcast:192.168.122.255 Mask:255.255.255.0
>>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
>>> collisions:0 txqueuelen:0
>>> RX bytes:0 (0.0 b) TX bytes:3925 (3.8 KiB)
>>>
>>> wlan0 Link encap:Ethernet HWaddr 00:1E:4C:59:99:36
>>> inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0
>>> inet6 addr: fe80::21e:4cff:fe59:9936/64 Scope:Link
>>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>>> RX packets:1671 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:1555 errors:0 dropped:0 overruns:0 carrier:0
>>> collisions:0 txqueuelen:1000
>>> RX bytes:1724236 (1.6 MiB) TX bytes:285862 (279.1 KiB)
>>> ======================================================================
>>>
>>> =====corosync.log
>>> Sep 03 23:17:39 corosync [MAIN ] Corosync Cluster Engine ('1.2.7'):
>>> started and ready to provide service.
>>> Sep 03 23:17:39 corosync [MAIN ] Corosync built-in features: nss rdma
>>> Sep 03 23:17:39 corosync [MAIN ] Successfully read main configuration
>>> file '/etc/corosync/corosync.conf'.
>>> Sep 03 23:17:39 corosync [TOTEM ] Initializing transport (UDP/IP).
>>> Sep 03 23:17:39 corosync [TOTEM ] Initializing transmit/receive
>>> security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>>> Sep 03 23:17:39 corosync [TOTEM ] Could not set traffic priority.
>>> (Socket operation on non-socket)
>>
>> It seems that corosync couldn't create the socket for some reason.
>>
>> Can you try 'corosync -f' to run corosync in the foreground? Corosync
>> uses perror to print some socket errors, so the output may tell us
>> the reason for the error.
>>
>>
>> Thanks
>>
>> Kazutaka
>>
>>
>>> Sep 03 23:17:39 corosync [TOTEM ] The network interface
>>> [fe80::21e:4cff:fe59:9936] is now up.
>>> Sep 03 23:17:39 corosync [SERV ] Service engine loaded: corosync
>>> extended virtual synchrony service
>>> Sep 03 23:17:39 corosync [SERV ] Service engine loaded: corosync
>>> configuration service
>>> Sep 03 23:17:39 corosync [SERV ] Service engine loaded: corosync
>>> cluster closed process group service v1.01
>>> Sep 03 23:17:39 corosync [SERV ] Service engine loaded: corosync
>>> cluster config database access v1.01
>>> Sep 03 23:17:39 corosync [SERV ] Service engine loaded: corosync
>>> profile loading service
>>> Sep 03 23:17:39 corosync [SERV ] Service engine loaded: corosync
>>> cluster quorum service v0.1
>>> Sep 03 23:17:39 corosync [MAIN ] Compatibility mode set to whitetank.
>>> Using V1 and V2 of the synchronization engine.
>>> ==================================
>>>
>>>
>>> I observed that the computer responds very slowly to keyboard & mouse
>>> events when corosync is started with an IPv6 bindnetaddr.
>>>
>>>
>>> Thanks,
>>> Narendra.
>>>
>>> On Sat, Sep 4, 2010 at 2:36 AM, Steven Dake <sdake at redhat.com> wrote:
>>>> Perhaps your IPv6 interface isn't set up properly, or for some reason corosync
>>>> can't bind to it or to the multicast address. Can you attach
>>>> /var/log/cluster/corosync.log and the output of ifconfig?
>>>>
>>>> Thanks
>>>> -steve
>>>>
>>>> On 09/03/2010 12:36 PM, Narendra Prasad Madanapalli wrote:
>>>>>
>>>>> Thanks Steve. It works on Fedora 13 after disabling selinux/firewall. I
>>>>> encounter a similar problem when corosync is started with an IPv6
>>>>> address specified in corosync.conf, as follows:
>>>>>
>>>>> =======corosync.conf
>>>>> compatibility: whitetank
>>>>>
>>>>> totem {
>>>>> version: 2
>>>>> secauth: off
>>>>> threads: 0
>>>>> nodeid: 1
>>>>> interface {
>>>>> ringnumber: 0
>>>>> nodeid: 1
>>>>> bindnetaddr: fe80::21e:4cff:fe59:9936
>>>>> mcastaddr: ff05::1
>>>>> mcastport: 5405
>>>>> }
>>>>> }
>>>>>
>>>>> logging {
>>>>> fileline: off
>>>>> to_stderr: no
>>>>> to_logfile: yes
>>>>> to_syslog: yes
>>>>> logfile: /var/log/cluster/corosync.log
>>>>> debug: off
>>>>> timestamp: on
>>>>> logger_subsys {
>>>>> subsys: AMF
>>>>> debug: off
>>>>> }
>>>>> }
>>>>>
>>>>> amf {
>>>>> mode: disabled
>>>>>
>>>>> ===================
>>>>>
>>>>> Corosync started successfully, but sheepdog logs the same 'try again'
>>>>> errors in sheepdog.log. I made sure ip6tables was stopped before starting
>>>>> sheepdog. I am now trying to fix addr_to_str() to support IPv6
>>>>> addresses. I would appreciate any pointers on how to overcome
>>>>> this error.
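
For the addr_to_str() work, here is a minimal sketch of the formatting logic,
written in Python for brevity rather than in sheepdog's C. It assumes the
address is held as 16 raw bytes, with IPv4 stored in IPv4-mapped form
(::ffff:a.b.c.d). That layout, the function signature, and port 7000 in the
usage are assumptions for illustration and are not taken from the sheepdog
source.

```python
import ipaddress


def addr_to_str(addr: bytes, port: int = 0) -> str:
    """Format a 16-byte address (hypothetical layout) as text, with optional port."""
    if len(addr) != 16:
        raise ValueError("expected 16 address bytes")
    ip6 = ipaddress.IPv6Address(addr)
    ip4 = ip6.ipv4_mapped  # None unless the address is ::ffff:a.b.c.d
    if ip4 is not None:
        return f"{ip4}:{port}" if port else str(ip4)
    # Bracket IPv6 literals so the port separator stays unambiguous.
    return f"[{ip6}]:{port}" if port else str(ip6)


if __name__ == "__main__":
    raw = ipaddress.IPv6Address("fe80::21e:4cff:fe59:9936").packed
    print(addr_to_str(raw, 7000))
```

Bracketing the IPv6 form matters because the address itself contains colons,
so an unbracketed "address:port" string could not be parsed back reliably.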
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Narendra.
>>>>>
>>>>> On Wed, Aug 11, 2010 at 9:47 PM, Steven Dake <sdake at redhat.com> wrote:
>>>>>>
>>>>>> On 08/11/2010 09:10 AM, Narendra Prasad Madanapalli wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I encounter the error mentioned in the subject when sheep is started.
>>>>>>>
>>>>>>> I would appreciate it if someone could help me overcome this issue.
>>>>>>>
>>>>>>> Here are the details of corosync & sheepdog:
>>>>>>>
>>>>>>> OS Distro: FC11
>>>>>>>
>>>>>>> Corosync:
>>>>>>> corosynclib-devel-1.2.3-1.fc11.i586
>>>>>>> corosync-1.2.3-1.fc11.i586
>>>>>>> corosynclib-1.2.3-1.fc11.i586
>>>>>>>
>>>>>>
>>>>>> You may have iptables enabled, which blocks corosync from executing.
>>>>>> Another common problem is that selinux is enabled, which only works
>>>>>> well on newer Fedora versions.
>>>>>>
>>>>>> Regards
>>>>>> -steve
>>>>>>
>>>>>>> Corosync log contents when it is started:
>>>>>>> Aug 11 09:29:36 corosync [MAIN ] Corosync Cluster Engine ('1.2.3'):
>>>>>>> started and ready to provide service.
>>>>>>> Aug 11 09:29:36 corosync [MAIN ] Corosync built-in features: nss rdma
>>>>>>> Aug 11 09:29:36 corosync [MAIN ] Successfully read main configuration
>>>>>>> file '/etc/corosync/corosync.conf'.
>>>>>>> Aug 11 09:29:36 corosync [TOTEM ] Initializing transport (UDP/IP).
>>>>>>> Aug 11 09:29:36 corosync [TOTEM ] Initializing transmit/receive
>>>>>>> security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>>>>>>> Aug 11 09:29:36 corosync [TOTEM ] The network interface
>>>>>>> [192.168.122.1] is now up.
>>>>>>> Aug 11 09:29:36 corosync [SERV ] Service engine loaded: corosync
>>>>>>> extended virtual synchrony service
>>>>>>> Aug 11 09:29:36 corosync [SERV ] Service engine loaded: corosync
>>>>>>> configuration service
>>>>>>> Aug 11 09:29:36 corosync [SERV ] Service engine loaded: corosync
>>>>>>> cluster closed process group service v1.01
>>>>>>> Aug 11 09:29:36 corosync [SERV ] Service engine loaded: corosync
>>>>>>> cluster config database access v1.01
>>>>>>> Aug 11 09:29:36 corosync [SERV ] Service engine loaded: corosync
>>>>>>> profile loading service
>>>>>>> Aug 11 09:29:36 corosync [SERV ] Service engine loaded: corosync
>>>>>>> cluster quorum service v0.1
>>>>>>> Aug 11 09:29:36 corosync [MAIN ] Compatibility mode set to whitetank.
>>>>>>> Using V1 and V2 of the synchronization engine.
>>>>>>>
>>>>>>>
>>>>>>> corosync.conf:
>>>>>>> # cat /etc/corosync/corosync.conf
>>>>>>> # Please read the corosync.conf.5 manual page
>>>>>>> compatibility: whitetank
>>>>>>>
>>>>>>> totem {
>>>>>>> version: 2
>>>>>>> secauth: off
>>>>>>> threads: 0
>>>>>>> interface {
>>>>>>> ringnumber: 0
>>>>>>> bindnetaddr: 192.168.122.1
>>>>>>> mcastaddr: 226.94.1.1
>>>>>>> mcastport: 5405
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> logging {
>>>>>>> fileline: off
>>>>>>> to_stderr: yes
>>>>>>> to_logfile: yes
>>>>>>> to_syslog: yes
>>>>>>> logfile: /tmp/corosync.log
>>>>>>> debug: off
>>>>>>> timestamp: on
>>>>>>> logger_subsys {
>>>>>>> subsys: AMF
>>>>>>> debug: off
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> amf {
>>>>>>> mode: disabled
>>>>>>> }
>>>>>>>
>>>>>>> sheepdog.log:
>>>>>>> Aug 11 09:48:05 worker_routine(215) started this thread 60
>>>>>>> Aug 11 09:48:05 worker_routine(215) started this thread 61
>>>>>>> Aug 11 09:48:05 worker_routine(215) started this thread 62
>>>>>>> Aug 11 09:48:05 worker_routine(215) started this thread 63
>>>>>>> Aug 11 09:48:06 create_cluster(1652) Failed to join the sheepdog
>>>>>>> group, try again
>>>>>>> Aug 11 09:48:07 create_cluster(1652) Failed to join the sheepdog
>>>>>>> group, try again
>>>>>>> Aug 11 09:48:08 create_cluster(1652) Failed to join the sheepdog
>>>>>>> group, try again
>>>>>>> Aug 11 09:48:09 create_cluster(1652) Failed to join the sheepdog
>>>>>>> group, try again
>>>>>>> Aug 11 09:48:10 create_cluster(1652) Failed to join the sheepdog
>>>>>>> group, try again
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Narendra.
>>>>>>
>>>>>>
>>>>
>>>>
>>> --
>>> sheepdog mailing list
>>> sheepdog at lists.wpkg.org
>>> http://lists.wpkg.org/mailman/listinfo/sheepdog
>>