On 07/04/2012 03:15 PM, Alexandre DERUMIER wrote:
> Hi,
> I'm using a cluster with 3 servers,
> each server with 1 sheepdog daemon as gateway only (-g -p 7000), and 1 sheepdog daemon for disk (-p 7001)
>
> server1: 10.6.0.100
> server2: 10.6.0.101
> server3: 10.6.0.102
>
> The cluster is formatted with:
> collie cluster format --copies=3
>
> I'm launching a fio benchmark from the VM, using gateway 10.6.0.100:7000, then I kill the sheep daemon on 10.6.0.102:7001.
>
> The gateway (10.6.0.100:7000) then crashes after failing to connect to 10.6.0.102:7001.
> The other daemons work fine.
> Any idea?
>
> Jul 04 09:03:38 [gateway 160310] do_read(268) failed to read from socket: 0
> Jul 04 09:03:38 [gateway 160313] do_read(268) failed to read from socket: 0
> Jul 04 09:03:38 [gateway 160310] wait_forward_write(179) remote node might have gone away
> Jul 04 09:03:38 [gateway 160312] do_read(268) failed to read from socket: 0
> Jul 04 09:03:38 [gateway 160313] wait_forward_write(179) remote node might have gone away
> Jul 04 09:03:38 [gateway 160316] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160312] wait_forward_write(179) remote node might have gone away
> Jul 04 09:03:38 [gateway 160315] do_read(268) failed to read from socket: 0
> Jul 04 09:03:38 [gateway 160314] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160317] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160316] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160315] wait_forward_write(179) remote node might have gone away
> Jul 04 09:03:38 [gateway 160318] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160314] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160317] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160321] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160319] do_read(268) failed to read from socket: 0
> Jul 04 09:03:38 [gateway 160318] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160323] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160320] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160321] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160324] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160319] wait_forward_write(179) remote node might have gone away
> Jul 04 09:03:38 [gateway 160323] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160320] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160322] do_read(268) failed to read from socket: 0
> Jul 04 09:03:38 [gateway 160326] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160324] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160327] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160322] wait_forward_write(179) remote node might have gone away
> Jul 04 09:03:38 [gateway 160328] do_write(301) failed to write to socket: Broken pipe
> Jul 04 09:03:38 [gateway 160326] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160327] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160328] send_req(337) failed to send request 3, 4096: Broken pipe
> Jul 04 09:03:38 [gateway 160329] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
> Jul 04 09:03:38 [gateway 160330] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
> Jul 04 09:03:38 [gateway 160333] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
> Jul 04 09:03:38 [gateway 160334] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
> Jul 04 09:03:38 [gateway 160332] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
> Jul 04 09:03:38 [gateway 160331] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
> Jul 04 09:03:38 [gateway 160335] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
> Jul 04 09:03:38 [gateway 160337] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
> Jul 04 09:03:38 [gateway 160338] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
> Jul 04 09:03:38 [gateway 160336] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
> Jul 04 09:03:38 [main] crash_handler(408) sheep pid 2148 exited unexpectedly.
>

Hi Alexandre,

Can you find a file named 'core' in your /store directory? If so, please run 'gdb sheep /store/core', type the 'where' command, and paste the output to the list.

Also, could you enable the '-d' option for sheep to get more debug output, and attach that log to the mailing list?

Thanks,
Yuan
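
For anyone following the same steps, here is a minimal sketch of collecting the backtrace non-interactively. It assumes the store directory is /store and that the binary is reachable as 'sheep', as in the commands above. The STORE and SHEEP_BIN variables, the backtrace_cmd helper and the output file name are hypothetical, introduced only for illustration. The -batch and -ex options are standard gdb options that run 'where' and exit, which should be equivalent to typing it at the gdb prompt.

```shell
#!/bin/sh
# Sketch only: STORE and SHEEP_BIN are assumed names; adjust them to your setup.
STORE="${STORE:-/store}"
SHEEP_BIN="${SHEEP_BIN:-sheep}"

# Print the gdb command line that dumps a backtrace from the given core file.
# 'where' is the command requested in the reply above.
backtrace_cmd() {
    printf 'gdb -batch -ex where %s %s\n' "$SHEEP_BIN" "$1"
}

if [ -f "$STORE/core" ]; then
    # Show the command, then run it and save the output for the mailing list.
    backtrace_cmd "$STORE/core"
    gdb -batch -ex where "$SHEEP_BIN" "$STORE/core" > sheep-backtrace.txt 2>&1
    echo "backtrace written to sheep-backtrace.txt"
else
    # A missing core file often means the core size limit is 0 in the shell
    # that started sheep; 'ulimit -c' prints the current limit.
    echo "no core file found in $STORE"
fi
```

To get the extra debug output, add -d to the flags the gateway is already started with (-g -p 7000 in the report above) and reproduce the crash before collecting the log.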