On Mon, Aug 20, 2012 at 9:57 AM, Brook <jingyu_chen at hotmail.com> wrote:
> Hi All.
> After a failure of switcher, my sheepdog cluster can't run.
> I got the message below, what should i do ?
> [root@17-IDC-D-2115 ~]# collie vdi list
>   Name  Id  Size  Used  Shared  Creation time  VDI id  Tag
> Failed to read object 8083d2b800000000 Waiting for other nodes to join cluster
> Failed to read inode header
> Failed to read object 8083d46b00000000 Waiting for other nodes to join cluster
> Failed to read inode header
> Failed to read object 8083d7d100000000 Waiting for other nodes to join cluster
> Failed to read inode header
> Failed to read object 8083d7d200000000 Waiting for other nodes to join cluster
> Failed to read inode header
> Failed to read object 8083db3700000000 Waiting for other nodes to join cluster
> Failed to read inode header
> Failed to read object 8083de9d00000000 Waiting for other nodes to join cluster
> Failed to read inode header
> Failed to read object 809bee7c00000000 Waiting for other nodes to join cluster
> Failed to read inode header
> Failed to read object 809bf02f00000000 Waiting for other nodes to join cluster
> ......
>
> [root@17-IDC-D-2115 ~]# collie vdi create vol1 1G
> Failed to create VDI vol1: Waiting for other nodes to join cluster
>
> [root@17-IDC-D-2115 ~]# corosync-cpgtool
> Group Name       PID         Node ID
> sheepdog
>                12747      1100130496 (192.168.146.65)
>                 8228      1150462144 (192.168.146.68)
>                15328      1133684928 (192.168.146.67)
>                 2076      1116907712 (192.168.146.66)
>
> [root@17-IDC-D-2115 ~]# collie node list
> M   Id   Host:Port             V-Nodes   Zone
> -    0   192.168.146.65:7000        64   1100130496
> -    1   192.168.146.66:7000        64   1116907712
> -    2   192.168.146.67:7000        64   1133684928
> -    3   192.168.146.68:7000        64   1150462144
>
> [root@17-IDC-D-2115 ~]# collie node info
> Id      Size    Used    Use%
> Cannot get information from any nodes
>
> [root@17-IDC-D-2115 ~]# collie cluster info
> Cluster status: Waiting for other nodes to join cluster
>
> Cluster created at Mon Jul  9 16:57:18 2012
>
> Epoch Time           Version
> 2012-08-16 14:51:45     37 [192.168.146.65:7000, 192.168.146.66:7000,
> 192.168.146.67:7000, 192.168.146.68:7000, 192.168.146.69:7000]
> 2012-08-16 14:51:44     36 [192.168.146.65:7000, 192.168.146.66:7000,
> 192.168.146.67:7000, 192.168.146.68:7000, 192.168.146.69:7000,
> 192.168.146.71:7000]

Epoch 36 has 6 nodes, 192.168.146.[65-69,71], but in epoch 37 you
didn't start the 192.168.146.71 node.

> ......
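
If it helps, the membership comparison can be done mechanically. The
snippet below is only a rough sketch: the three address sets are copied
by hand from the `collie cluster info` and `collie node list` output
quoted above, and the helper name `missing` is made up for this
illustration (it is not part of sheepdog or collie).

```python
# Membership copied by hand from the quoted command output above.
epoch_36 = {
    "192.168.146.65", "192.168.146.66", "192.168.146.67",
    "192.168.146.68", "192.168.146.69", "192.168.146.71",
}
epoch_37 = {
    "192.168.146.65", "192.168.146.66", "192.168.146.67",
    "192.168.146.68", "192.168.146.69",
}
# Nodes reported by `collie node list` at the time of the report.
running = {
    "192.168.146.65", "192.168.146.66",
    "192.168.146.67", "192.168.146.68",
}


def missing(expected, actual):
    """Return the addresses in `expected` that are absent from `actual`,
    sorted numerically by octet. Hypothetical helper for illustration."""
    return sorted(
        expected - actual,
        key=lambda ip: tuple(int(part) for part in ip.split(".")),
    )


print("in epoch 36 but not in epoch 37:", missing(epoch_36, epoch_37))
print("in epoch 37 but not running now:", missing(epoch_37, running))
```

The first line of output names the node that dropped out between the
two epochs. The second compares the latest epoch with the nodes that
are actually up, which is the set the cluster is still waiting on.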
>
> [root@17-IDC-D-2115 ~]# tail -n30 /data/sheepdog/sheep.log
> Aug 20 09:28:11 [main] listen_handler(819) accepted a new connection: 13
> Aug 20 09:28:11 [main] client_rx_handler(577) connection from: 13, ::1:45306
> Aug 20 09:28:11 [main] queue_request(323) 82
> Aug 20 09:28:11 [io 18] do_process_work(990) 82, 0 , 37
> Aug 20 09:28:11 [main] client_tx_handler(663) connection from: 13, ::1:45306
> Aug 20 09:28:11 [main] client_handler(764) connection seems to be dead
> Aug 20 09:28:11 [main] clear_client(703) refcnt:0, fd:13, ::1:45306
> Aug 20 09:28:11 [main] destroy_client(672) connection from: ::1:45306
> Aug 20 09:28:11 [main] listen_handler(819) accepted a new connection: 13
> Aug 20 09:28:11 [main] client_rx_handler(577) connection from: 13, ::1:45307
> Aug 20 09:28:11 [main] queue_request(323) 11
> Aug 20 09:28:11 [main] client_tx_handler(663) connection from: 13, ::1:45307
> Aug 20 09:28:11 [main] client_handler(764) connection seems to be dead
> Aug 20 09:28:11 [main] clear_client(703) refcnt:0, fd:13, ::1:45307
> Aug 20 09:28:11 [main] destroy_client(672) connection from: ::1:45307
> Aug 20 09:28:13 [main] listen_handler(819) accepted a new connection: 13
> Aug 20 09:28:13 [main] client_rx_handler(577) connection from: 13, ::1:45308
> Aug 20 09:28:13 [main] queue_request(323) 82
> Aug 20 09:28:13 [io 19] do_process_work(990) 82, 0 , 37
> Aug 20 09:28:13 [main] client_tx_handler(663) connection from: 13, ::1:45308
> Aug 20 09:28:13 [main] client_handler(764) connection seems to be dead
> Aug 20 09:28:13 [main] clear_client(703) refcnt:0, fd:13, ::1:45308
> Aug 20 09:28:13 [main] destroy_client(672) connection from: ::1:45308
> Aug 20 09:28:13 [main] listen_handler(819) accepted a new connection: 13
> Aug 20 09:28:13 [main] client_rx_handler(577) connection from: 13, ::1:45309
> Aug 20 09:28:13 [main] queue_request(323) 11
> Aug 20 09:28:13 [main] client_tx_handler(663) connection from: 13, ::1:45309
> Aug 20 09:28:13 [main] client_handler(764) connection seems to be dead
> Aug 20 09:28:13 [main] clear_client(703) refcnt:0, fd:13, ::1:45309
> Aug 20 09:28:13 [main] destroy_client(672) connection from: ::1:45309
>
>
> --
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog
>

--
Yunkai Zhang
Work at Taobao