[sheepdog] I got a “Waiting for other nodes to join cluster”
Yunkai Zhang
yunkai.me at gmail.com
Mon Aug 20 04:04:48 CEST 2012
On Mon, Aug 20, 2012 at 9:57 AM, Brook <jingyu_chen at hotmail.com> wrote:
> Hi All.
> After a switch failure, my sheepdog cluster can't run.
> I got the messages below. What should I do?
> [root at 17-IDC-D-2115 ~]# collie vdi list
> Name Id Size Used Shared Creation time VDI id Tag
> Failed to read object 8083d2b800000000 Waiting for other nodes to join
> cluster
> Failed to read inode header
> Failed to read object 8083d46b00000000 Waiting for other nodes to join
> cluster
> Failed to read inode header
> Failed to read object 8083d7d100000000 Waiting for other nodes to join
> cluster
> Failed to read inode header
> Failed to read object 8083d7d200000000 Waiting for other nodes to join
> cluster
> Failed to read inode header
> Failed to read object 8083db3700000000 Waiting for other nodes to join
> cluster
> Failed to read inode header
> Failed to read object 8083de9d00000000 Waiting for other nodes to join
> cluster
> Failed to read inode header
> Failed to read object 809bee7c00000000 Waiting for other nodes to join
> cluster
> Failed to read inode header
> Failed to read object 809bf02f00000000 Waiting for other nodes to join
> cluster
> ......
>
> [root at 17-IDC-D-2115 ~]# collie vdi create vol1 1G
> Failed to create VDI vol1: Waiting for other nodes to join cluster
>
> [root at 17-IDC-D-2115 ~]# corosync-cpgtool
> Group Name PID Node ID
> sheepdog
> 12747 1100130496 (192.168.146.65)
> 8228 1150462144 (192.168.146.68)
> 15328 1133684928 (192.168.146.67)
> 2076 1116907712 (192.168.146.66)
>
> [root at 17-IDC-D-2115 ~]# collie node list
> M Id Host:Port V-Nodes Zone
> - 0 192.168.146.65:7000 64 1100130496
> - 1 192.168.146.66:7000 64 1116907712
> - 2 192.168.146.67:7000 64 1133684928
> - 3 192.168.146.68:7000 64 1150462144
>
> [root at 17-IDC-D-2115 ~]# collie node info
> Id Size Used Use%
> Cannot get information from any nodes
>
> [root at 17-IDC-D-2115 ~]# collie cluster info
> Cluster status: Waiting for other nodes to join cluster
>
> Cluster created at Mon Jul 9 16:57:18 2012
>
> Epoch Time Version
> 2012-08-16 14:51:45 37 [192.168.146.65:7000, 192.168.146.66:7000,
> 192.168.146.67:7000, 192.168.146.68:7000, 192.168.146.69:7000]
> 2012-08-16 14:51:44 36 [192.168.146.65:7000, 192.168.146.66:7000,
> 192.168.146.67:7000, 192.168.146.68:7000, 192.168.146.69:7000,
> 192.168.146.71:7000]
Version 36 has 6 nodes, 192.168.146.[65-69,71], but in version 37
you didn't start the 192.168.146.71 node.
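
To make that comparison less error-prone, here is a minimal Python sketch
that diffs the membership lists. It is not part of sheepdog or collie: the
names parse_epochs and missing_nodes are made up for illustration, and the
input is simply the text pasted from the `collie cluster info` and
`collie node list` output quoted in this thread. It assumes each epoch entry
has the form "date time version [host:port, ...]" as shown above.

```python
import re

# Epoch entries copied from the `collie cluster info` output in this thread.
EPOCH_LOG = """\
2012-08-16 14:51:45 37 [192.168.146.65:7000, 192.168.146.66:7000,
192.168.146.67:7000, 192.168.146.68:7000, 192.168.146.69:7000]
2012-08-16 14:51:44 36 [192.168.146.65:7000, 192.168.146.66:7000,
192.168.146.67:7000, 192.168.146.68:7000, 192.168.146.69:7000,
192.168.146.71:7000]
"""

# Hosts copied from the `collie node list` output in this thread.
CURRENT_NODES = {
    "192.168.146.65:7000",
    "192.168.146.66:7000",
    "192.168.146.67:7000",
    "192.168.146.68:7000",
}


def parse_epochs(text):
    """Return {version: set of "host:port"} parsed from pasted epoch entries.

    Hypothetical helper: it assumes the "date time version [members]" layout
    seen in this thread, with member lists possibly wrapped across lines.
    """
    pattern = re.compile(
        r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\s+(\d+)\s+\[([^\]]*)\]"
    )
    epochs = {}
    for version, members in pattern.findall(text):
        epochs[int(version)] = {m.strip() for m in members.split(",") if m.strip()}
    return epochs


def missing_nodes(expected, actual):
    """Return the sorted members of `expected` that are absent from `actual`."""
    return sorted(expected - actual)


epochs = parse_epochs(EPOCH_LOG)

# Nodes present in version 36 but gone in version 37.
print(missing_nodes(epochs[36], epochs[37]))

# Nodes recorded in version 37 but absent from the current node list.
print(missing_nodes(epochs[37], CURRENT_NODES))
```

With the data from this thread, the first call prints
['192.168.146.71:7000'] and the second prints ['192.168.146.69:7000'].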
> ......
>
> [root at 17-IDC-D-2115 ~]# tail -n30 /data/sheepdog/sheep.log
> Aug 20 09:28:11 [main] listen_handler(819) accepted a new connection: 13
> Aug 20 09:28:11 [main] client_rx_handler(577) connection from: 13, ::1:45306
> Aug 20 09:28:11 [main] queue_request(323) 82
> Aug 20 09:28:11 [io 18] do_process_work(990) 82, 0 , 37
> Aug 20 09:28:11 [main] client_tx_handler(663) connection from: 13, ::1:45306
> Aug 20 09:28:11 [main] client_handler(764) connection seems to be dead
> Aug 20 09:28:11 [main] clear_client(703) refcnt:0, fd:13, ::1:45306
> Aug 20 09:28:11 [main] destroy_client(672) connection from: ::1:45306
> Aug 20 09:28:11 [main] listen_handler(819) accepted a new connection: 13
> Aug 20 09:28:11 [main] client_rx_handler(577) connection from: 13, ::1:45307
> Aug 20 09:28:11 [main] queue_request(323) 11
> Aug 20 09:28:11 [main] client_tx_handler(663) connection from: 13, ::1:45307
> Aug 20 09:28:11 [main] client_handler(764) connection seems to be dead
> Aug 20 09:28:11 [main] clear_client(703) refcnt:0, fd:13, ::1:45307
> Aug 20 09:28:11 [main] destroy_client(672) connection from: ::1:45307
> Aug 20 09:28:13 [main] listen_handler(819) accepted a new connection: 13
> Aug 20 09:28:13 [main] client_rx_handler(577) connection from: 13, ::1:45308
> Aug 20 09:28:13 [main] queue_request(323) 82
> Aug 20 09:28:13 [io 19] do_process_work(990) 82, 0 , 37
> Aug 20 09:28:13 [main] client_tx_handler(663) connection from: 13, ::1:45308
> Aug 20 09:28:13 [main] client_handler(764) connection seems to be dead
> Aug 20 09:28:13 [main] clear_client(703) refcnt:0, fd:13, ::1:45308
> Aug 20 09:28:13 [main] destroy_client(672) connection from: ::1:45308
> Aug 20 09:28:13 [main] listen_handler(819) accepted a new connection: 13
> Aug 20 09:28:13 [main] client_rx_handler(577) connection from: 13, ::1:45309
> Aug 20 09:28:13 [main] queue_request(323) 11
> Aug 20 09:28:13 [main] client_tx_handler(663) connection from: 13, ::1:45309
> Aug 20 09:28:13 [main] client_handler(764) connection seems to be dead
> Aug 20 09:28:13 [main] clear_client(703) refcnt:0, fd:13, ::1:45309
> Aug 20 09:28:13 [main] destroy_client(672) connection from: ::1:45309
>
>
> --
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog
>
--
Yunkai Zhang
Work at Taobao