[Sheepdog] create_cluster(1652) Failed to join the sheepdog group, try again

Narendra Prasad Madanapalli narendramind at gmail.com
Sun Sep 12 17:30:33 CEST 2010


Hi Kazutaka,

Surprisingly, it works today: corosync -f no longer throws any errors.
However, ping6 throws the following error:

# ping6 fe80::21e:4cff:fe59:9936
connect: Invalid argument
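
One possible explanation for the ping6 failure (an assumption, not
confirmed anywhere in this thread): fe80::/10 addresses are link-local, so
the kernel needs to know which interface to use. ping6 normally wants the
interface given with -I or as a %zone suffix such as %wlan0. The sketch
below only illustrates that check. The helper names needs_scope_id() and
with_scope() are hypothetical and are not part of corosync or sheepdog.

```python
import ipaddress


def needs_scope_id(addr: str) -> bool:
    """Return True if addr is an IPv6 link-local address (fe80::/10)."""
    # Drop any existing %zone suffix before parsing the address.
    ip = ipaddress.ip_address(addr.split("%", 1)[0])
    return ip.version == 6 and ip.is_link_local


def with_scope(addr: str, ifname: str) -> str:
    """Append %ifname to a link-local address that has no zone yet."""
    if needs_scope_id(addr) and "%" not in addr:
        return f"{addr}%{ifname}"
    return addr


if __name__ == "__main__":
    # wlan0 is the interface that carries this address in the ifconfig output.
    print(with_scope("fe80::21e:4cff:fe59:9936", "wlan0"))
```

If this is the cause, passing the printed address to ping6 (or running
ping6 -I wlan0 with the bare address) should get past the connect error.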


=====corosync.log
Sep 13 02:20:58 corosync [MAIN  ] Corosync Cluster Engine ('1.2.7'):
started and ready to provide service.
Sep 13 02:20:58 corosync [MAIN  ] Corosync built-in features: nss rdma
Sep 13 02:20:58 corosync [MAIN  ] Successfully read main configuration
file '/etc/corosync/corosync.conf'.
Sep 13 02:20:58 corosync [TOTEM ] Initializing transport (UDP/IP).
Sep 13 02:20:58 corosync [TOTEM ] Initializing transmit/receive
security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Sep 13 02:20:58 corosync [TOTEM ] The network interface
[fe80::21e:4cff:fe59:9936] is now up.
Sep 13 02:20:58 corosync [SERV  ] Service engine loaded: corosync
extended virtual synchrony service
Sep 13 02:20:58 corosync [SERV  ] Service engine loaded: corosync
configuration service
Sep 13 02:20:58 corosync [SERV  ] Service engine loaded: corosync
cluster closed process group service v1.01
Sep 13 02:20:58 corosync [SERV  ] Service engine loaded: corosync
cluster config database access v1.01
Sep 13 02:20:58 corosync [SERV  ] Service engine loaded: corosync
profile loading service
Sep 13 02:20:58 corosync [SERV  ] Service engine loaded: corosync
cluster quorum service v0.1
Sep 13 02:20:58 corosync [MAIN  ] Compatibility mode set to whitetank.
 Using V1 and V2 of the synchronization engine.
Sep 13 02:20:58 corosync [TOTEM ] A processor joined or left the
membership and a new membership was formed.
Sep 13 02:20:58 corosync [MAIN  ] Completed service synchronization,
ready to provide service.
Sep 13 02:21:32 corosync [SERV  ] Unloading all Corosync service engines.
Sep 13 02:21:32 corosync [SERV  ] Service engine unloaded: corosync
extended virtual synchrony service
Sep 13 02:21:32 corosync [SERV  ] Service engine unloaded: corosync
configuration service
Sep 13 02:21:32 corosync [SERV  ] Service engine unloaded: corosync
cluster closed process group service v1.01
Sep 13 02:21:32 corosync [SERV  ] Service engine unloaded: corosync
cluster config database access v1.01
Sep 13 02:21:32 corosync [SERV  ] Service engine unloaded: corosync
profile loading service
Sep 13 02:21:32 corosync [SERV  ] Service engine unloaded: corosync
cluster quorum service v0.1
Sep 13 02:21:32 corosync [MAIN  ] Corosync Cluster Engine exiting with
status 0 at main.c:170.
Sep 13 02:23:27 corosync [MAIN  ] Corosync Cluster Engine ('1.2.7'):
started and ready to provide service.
Sep 13 02:23:27 corosync [MAIN  ] Corosync built-in features: nss rdma
Sep 13 02:23:27 corosync [MAIN  ] Successfully read main configuration
file '/etc/corosync/corosync.conf'
==============


Thanks,
Narendra.

On Mon, Sep 6, 2010 at 1:36 PM, MORITA Kazutaka
<morita.kazutaka at lab.ntt.co.jp> wrote:
> At Sat, 4 Sep 2010 11:36:39 +0530,
> Narendra Prasad Madanapalli wrote:
>>
>> Hi Steve,
>>
>> Please find below the output of ifconfig and the contents of corosync.log
>>
>> ===========ifconfig
>> [nlakn at naninf ~]$ ifconfig
>> eth0      Link encap:Ethernet  HWaddr 00:1B:24:69:92:11
>>           UP BROADCAST MULTICAST  MTU:1500  Metric:1
>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>>           Interrupt:27 Base address:0xa000
>>
>> lo        Link encap:Local Loopback
>>           inet addr:127.0.0.1  Mask:255.0.0.0
>>           inet6 addr: ::1/128 Scope:Host
>>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>>           RX packets:8 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:480 (480.0 b)  TX bytes:480 (480.0 b)
>>
>> virbr0    Link encap:Ethernet  HWaddr 22:83:14:EC:B9:66
>>           inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:0 (0.0 b)  TX bytes:3925 (3.8 KiB)
>>
>> wlan0     Link encap:Ethernet  HWaddr 00:1E:4C:59:99:36
>>           inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
>>           inet6 addr: fe80::21e:4cff:fe59:9936/64 Scope:Link
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           RX packets:1671 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:1555 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:1724236 (1.6 MiB)  TX bytes:285862 (279.1 KiB)
>> ======================================================================
>>
>> =====corosync.log
>>  Sep 03 23:17:39 corosync [MAIN  ] Corosync Cluster Engine ('1.2.7'):
>> started and ready to provide service.
>> Sep 03 23:17:39 corosync [MAIN  ] Corosync built-in features: nss rdma
>> Sep 03 23:17:39 corosync [MAIN  ] Successfully read main configuration
>> file '/etc/corosync/corosync.conf'.
>> Sep 03 23:17:39 corosync [TOTEM ] Initializing transport (UDP/IP).
>> Sep 03 23:17:39 corosync [TOTEM ] Initializing transmit/receive
>> security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>> Sep 03 23:17:39 corosync [TOTEM ] Could not set traffic priority.
>> (Socket operation on non-socket)
>
> It seems that corosync couldn't create the socket for some reason.
>
> Can you try 'corosync -f' to run corosync in foreground?  Corosync
> uses perror for printing some socket errors, so the output may tell us
> the error reason.
>
>
> Thanks
>
> Kazutaka
>
>
>> Sep 03 23:17:39 corosync [TOTEM ] The network interface
>> [fe80::21e:4cff:fe59:9936] is now up.
>> Sep 03 23:17:39 corosync [SERV  ] Service engine loaded: corosync
>> extended virtual synchrony service
>> Sep 03 23:17:39 corosync [SERV  ] Service engine loaded: corosync
>> configuration service
>> Sep 03 23:17:39 corosync [SERV  ] Service engine loaded: corosync
>> cluster closed process group service v1.01
>> Sep 03 23:17:39 corosync [SERV  ] Service engine loaded: corosync
>> cluster config database access v1.01
>> Sep 03 23:17:39 corosync [SERV  ] Service engine loaded: corosync
>> profile loading service
>> Sep 03 23:17:39 corosync [SERV  ] Service engine loaded: corosync
>> cluster quorum service v0.1
>> Sep 03 23:17:39 corosync [MAIN  ] Compatibility mode set to whitetank.
>>  Using V1 and V2 of the synchronization engine.
>> ==================================
>>
>>
>> I observed that the computer is very slow to respond to keyboard and
>> mouse events when corosync is started with an IPv6 bindnetaddr.
>>
>>
>> Thanks,
>> Narendra.
>>
>> On Sat, Sep 4, 2010 at 2:36 AM, Steven Dake <sdake at redhat.com> wrote:
>> > Perhaps your ipv6 interface isn't set up properly or for some reason corosync
>> > can't bind to it or the multicast address.  Can you attach
>> > /var/log/cluster/corosync.log and output of ifconfig?
>> >
>> > Thanks
>> > -steve
>> >
>> > On 09/03/2010 12:36 PM, Narendra Prasad Madanapalli wrote:
>> >>
>> >> Thanks Steve. It works on Fedora 13 after disabling SELinux and the
>> >> firewall. I encounter a similar problem when corosync is started with
>> >> an IPv6 address specified in corosync.conf, as follows:
>> >>
>> >> =======corosync.conf
>> >> compatibility: whitetank
>> >>
>> >> totem {
>> >>         version: 2
>> >>         secauth: off
>> >>         threads: 0
>> >>         nodeid: 1
>> >>         interface {
>> >>                 ringnumber: 0
>> >>                 nodeid: 1
>> >>                 bindnetaddr: fe80::21e:4cff:fe59:9936
>> >>                 mcastaddr:  ff05::1
>> >>                 mcastport: 5405
>> >>         }
>> >> }
>> >>
>> >> logging {
>> >>         fileline: off
>> >>         to_stderr: no
>> >>         to_logfile: yes
>> >>         to_syslog: yes
>> >>         logfile: /var/log/cluster/corosync.log
>> >>         debug: off
>> >>         timestamp: on
>> >>         logger_subsys {
>> >>                 subsys: AMF
>> >>                 debug: off
>> >>         }
>> >> }
>> >>
>> >> amf {
>> >>         mode: disabled
>> >> }
>> >>
>> >> ===================
>> >>
>> >> Corosync started successfully, but sheepdog throws the same 'try again'
>> >> errors in sheepdog.log. I ensured that ip6tables was stopped before
>> >> starting sheepdog. I am now trying to fix addr_to_str() to support IPv6
>> >> addresses. I would appreciate it if you could provide pointers to
>> >> overcome this error.
>> >>
>> >>
>> >>
>> >> Thanks,
>> >> Narendra.
>> >>
>> >> On Wed, Aug 11, 2010 at 9:47 PM, Steven Dake<sdake at redhat.com>  wrote:
>> >>>
>> >>> On 08/11/2010 09:10 AM, Narendra Prasad Madanapalli wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> I encounter the error mentioned in the subject when sheep is started.
>> >>>>
>> >>>> I would appreciate it if someone could help me overcome this issue.
>> >>>>
>> >>>> Here are the details of corosync & sheepdog:
>> >>>>
>> >>>> OS Distro: FC11
>> >>>>
>> >>>> Corosync:
>> >>>> corosynclib-devel-1.2.3-1.fc11.i586
>> >>>> corosync-1.2.3-1.fc11.i586
>> >>>> corosynclib-1.2.3-1.fc11.i586
>> >>>>
>> >>>
>> >>> You may have iptables enabled, which blocks corosync from executing.
>> >>> Another common problem is that selinux is enabled, which only works
>> >>> well on newer fedora versions.
>> >>>
>> >>> Regards
>> >>> -steve
>> >>>
>> >>>> Corosync log contents when it is started:
>> >>>> Aug 11 09:29:36 corosync [MAIN  ] Corosync Cluster Engine ('1.2.3'):
>> >>>> started and ready to provide service.
>> >>>> Aug 11 09:29:36 corosync [MAIN  ] Corosync built-in features: nss rdma
>> >>>> Aug 11 09:29:36 corosync [MAIN  ] Successfully read main configuration
>> >>>> file '/etc/corosync/corosync.conf'.
>> >>>> Aug 11 09:29:36 corosync [TOTEM ] Initializing transport (UDP/IP).
>> >>>> Aug 11 09:29:36 corosync [TOTEM ] Initializing transmit/receive
>> >>>> security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>> >>>> Aug 11 09:29:36 corosync [TOTEM ] The network interface
>> >>>> [192.168.122.1] is now up.
>> >>>> Aug 11 09:29:36 corosync [SERV  ] Service engine loaded: corosync
>> >>>> extended virtual synchrony service
>> >>>> Aug 11 09:29:36 corosync [SERV  ] Service engine loaded: corosync
>> >>>> configuration service
>> >>>> Aug 11 09:29:36 corosync [SERV  ] Service engine loaded: corosync
>> >>>> cluster closed process group service v1.01
>> >>>> Aug 11 09:29:36 corosync [SERV  ] Service engine loaded: corosync
>> >>>> cluster config database access v1.01
>> >>>> Aug 11 09:29:36 corosync [SERV  ] Service engine loaded: corosync
>> >>>> profile loading service
>> >>>> Aug 11 09:29:36 corosync [SERV  ] Service engine loaded: corosync
>> >>>> cluster quorum service v0.1
>> >>>> Aug 11 09:29:36 corosync [MAIN  ] Compatibility mode set to whitetank.
>> >>>>  Using V1 and V2 of the synchronization engine.
>> >>>>
>> >>>>
>> >>>> corosync.conf:
>> >>>> # cat /etc/corosync/corosync.conf
>> >>>> # Please read the corosync.conf.5 manual page
>> >>>> compatibility: whitetank
>> >>>>
>> >>>> totem {
>> >>>>        version: 2
>> >>>>        secauth: off
>> >>>>        threads: 0
>> >>>>        interface {
>> >>>>                ringnumber: 0
>> >>>>                bindnetaddr: 192.168.122.1
>> >>>>                mcastaddr: 226.94.1.1
>> >>>>                mcastport: 5405
>> >>>>        }
>> >>>> }
>> >>>>
>> >>>> logging {
>> >>>>        fileline: off
>> >>>>        to_stderr: yes
>> >>>>        to_logfile: yes
>> >>>>        to_syslog: yes
>> >>>>        logfile: /tmp/corosync.log
>> >>>>        debug: off
>> >>>>        timestamp: on
>> >>>>        logger_subsys {
>> >>>>                subsys: AMF
>> >>>>                debug: off
>> >>>>        }
>> >>>> }
>> >>>>
>> >>>> amf {
>> >>>>        mode: disabled
>> >>>> }
>> >>>>
>> >>>> sheepdog.log:
>> >>>> Aug 11 09:48:05 worker_routine(215) started this thread 60
>> >>>> Aug 11 09:48:05 worker_routine(215) started this thread 61
>> >>>> Aug 11 09:48:05 worker_routine(215) started this thread 62
>> >>>> Aug 11 09:48:05 worker_routine(215) started this thread 63
>> >>>> Aug 11 09:48:06 create_cluster(1652) Failed to join the sheepdog
>> >>>> group, try again
>> >>>> Aug 11 09:48:07 create_cluster(1652) Failed to join the sheepdog
>> >>>> group, try again
>> >>>> Aug 11 09:48:08 create_cluster(1652) Failed to join the sheepdog
>> >>>> group, try again
>> >>>> Aug 11 09:48:09 create_cluster(1652) Failed to join the sheepdog
>> >>>> group, try again
>> >>>> Aug 11 09:48:10 create_cluster(1652) Failed to join the sheepdog
>> >>>> group, try again
>> >>>>
>> >>>>
>> >>>> Thanks,
>> >>>> Narendra.
>> >>>
>> >>>
>> >
>> >
>> --
>> sheepdog mailing list
>> sheepdog at lists.wpkg.org
>> http://lists.wpkg.org/mailman/listinfo/sheepdog
>


