[sheepdog-users] Problem rejoining cluster

Cristian Del Carlo cristian.delcarlo at targetsolutions.it
Mon Jul 9 11:36:41 CEST 2018


Hi Takashi,

thanks for your help.

I stopped the node for a while and then restarted it. Now the service is
up and the recovery is running.
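
For anyone hitting the same error, the wait-then-check procedure can be
sketched in Python. This is only a rough sketch, not something from the
sheepdog tools: the function names are made up for illustration, it assumes
"dog node list" prints each member as a host:port token, and it assumes the
check runs on a node that is still in the cluster (the rebooted node's own
sheep is down). The timeout is read from the zk_init log line, which
reports milliseconds.

```python
import re
import subprocess
import time


def session_timeout_seconds(log_line):
    """Return the ZooKeeper session timeout from a zk_init log line, in seconds.

    The log reports milliseconds (e.g. 30000); returns None if not found.
    """
    match = re.search(r"session timeout is (\d+)", log_line)
    return int(match.group(1)) / 1000 if match else None


def node_listed(node_list_output, host, port):
    """Return True if host:port appears as a whole token in the given output.

    Assumption: `dog node list` shows members in host:port form.
    """
    target = f"{host}:{port}"
    return any(target in line.split() for line in node_list_output.splitlines())


def wait_until_unlisted(host, port, timeout_s=30.0, poll_s=5.0):
    """Poll `dog node list` until host:port is gone; True means safe to restart.

    Gives up after twice the session timeout (an arbitrary safety margin).
    """
    deadline = time.monotonic() + 2 * timeout_s
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["dog", "node", "list"], capture_output=True, text=True, check=True
        )
        if not node_listed(result.stdout, host, port):
            return True
        time.sleep(poll_s)
    return False
```

In my case that would be wait_until_unlisted("10.0.0.34", 7001), followed
by starting the sheepdog service again once it returns True.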

Thanks,

Cristian


2018-07-09 10:32 GMT+02:00 Takashi Menjo <menjo.takashi at lab.ntt.co.jp>:

> Hello,
>
>
> > I have to reboot a node and  then on this node sheepdog fail to start
> > with this error:
> > [..]
> > Jul  9 09:12:00 node02 sheep[3769]: ERROR [main] zk_join(1022) Previous zookeeper session exist, shoot myself. Please wait for 30 seconds to join me again.
>
>
> Did you try to restart the node __within 30 seconds of it going down__?
> If so, please wait for a while, as the ERROR log says.
>
> Then type "dog node list" to check whether the restarting node still
> appears. If it does __not__, you can restart the node to rejoin your cluster.
>
>
> Regards,
> Takashi
>
> --
> Takashi Menjo - NTT Software Innovation Center
> <menjo.takashi at lab.ntt.co.jp>
>
> > -----Original Message-----
> > From: sheepdog-users [mailto:sheepdog-users-bounces at lists.wpkg.org] On
> > Behalf Of Cristian Del Carlo
> > Sent: Monday, July 9, 2018 4:17 PM
> > To: sheepdog-users at lists.wpkg.org
> > Subject: [sheepdog-users] Problem rejoining cluster
> >
> > Hi,
> >
> > I have a cluster with 4 nodes.
> >
> > All nodes run CentOS 7.x, ZooKeeper 3.4.6 and sheepdog 1.0.1.
> > I had to reboot a node, and now sheepdog fails to start on that node
> > with this error:
> >
> > Jul  9 09:12:00 node02 sheep[3769]:  INFO [main] zk_init(1503) the negociated session timeout is 30000
> > Jul  9 09:12:00 node02 sheep[3769]: NOTICE [main] get_local_addr(551) found IPv4 address
> > Jul  9 09:12:00 node02 sheep[3769]:  INFO [main] send_join_request(1093) IPv4 ip:10.0.0.34 port:7001 going to join the cluster
> > Jul  9 09:12:00 node02 systemd: sheepdoghd.service never wrote its PID file. Failing.
> > Jul  9 09:12:00 node02 sheep[3769]: ERROR [main] zk_join(1022) Previous zookeeper session exist, shoot myself. Please wait for 30 seconds to join me again.
> > Jul  9 09:12:00 node02 systemd: Failed to start Sheepdog QEMU/KVM Block Storage.
> > Jul  9 09:12:00 node02 systemd: Unit sheepdoghd.service entered failed state.
> > Jul  9 09:12:00 node02 systemd: sheepdoghd.service failed.
> > Jul  9 09:12:00 node02 systemd: sheepdoghd.service holdoff time over, scheduling restart.
> > Jul  9 09:12:00 node02 systemd: Starting Sheepdog QEMU/KVM Block Storage...
> >
> >
> >
> > Could you suggest how to solve this?
> >
> > Thanks in advance for your advice.
> >
> > Cristian
> >