>> copies=3 means you should have at least 3 sheep daemons available, or
>> the cluster will go into the halted state (not serving any I/O requests at all)

OK, thanks, I didn't know that. It's clearer now.

>> So in your configuration, after killing one daemon, only 2 are left, right?
>> Then the cluster goes into the halted state, though the gateway shouldn't
>> crash.

Hmm, I have redone the test, and the gateway didn't crash this time.

Jul 04 10:15:05 [gateway 60319] wait_forward_write(179) remote node might have gone away
Jul 04 10:15:05 [gateway 60321] wait_forward_write(179) remote node might have gone away
Jul 04 10:15:05 [gateway 60322] wait_forward_write(179) remote node might have gone away
Jul 04 10:15:05 [gateway 60320] wait_forward_write(179) remote node might have gone away
Jul 04 10:15:05 [gateway 60323] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60326] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60324] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60325] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60327] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60328] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60329] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60330] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60331] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60332] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60333] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60334] connect_to(234) failed to connect to 10.6.0.102:7001: Connection refused
Jul 04 10:15:05 [gateway 60332] wait_forward_write(187) fail 19

I didn't see your previous mail; I'll retry and check whether I find a core file.

Thanks for the help
- Alexandre

----- Original Message -----
From: "Liu Yuan" <namei.unix at gmail.com>
To: "Alexandre DERUMIER" <aderumier at odiso.com>
Cc: sheepdog-users at lists.wpkg.org
Sent: Wednesday, July 4, 2012 09:47:04
Subject: Re: [sheepdog-users] gateway crashing if 1 node fail

On 07/04/2012 03:15 PM, Alexandre DERUMIER wrote:
> Hi,
> I'm using a cluster with 3 servers,
> each server with 1 sheepdog daemon gateway only (-g -p 7000), and 1 sheepdog for disk (-p 7001)
>
> server1 :10.6.0.100
> server2 :10.6.0.101
> server3 :10.6.0.102
>
> cluster is formatted with:
> collie cluster format --copies=3
>

copies=3 means you should have at least 3 sheep daemons available, or
the cluster will go into the halted state (not serving any I/O requests at all).

So in your configuration, after killing one daemon, only 2 are left, right?
Then the cluster goes into the halted state, though the gateway shouldn't
crash.

Thanks,
Yuan

--
Alexandre Derumier
Systems and Network Engineer
Phone: 03 20 68 88 85
Fax: 03 20 68 90 88
45 Bvd du Général Leclerc 59100 Roubaix
12 rue Marivaux 75002 Paris
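
The rule Yuan describes (with --copies=3, fewer than 3 available sheep daemons puts the cluster into the halted state) can be sketched as a small check. This is only an illustrative model of the behaviour described in this thread, not sheepdog's actual code; the function name `cluster_state` and its parameters are hypothetical, and it assumes that only the disk daemons (the ones on port 7001 here) count as available copies.

```python
def cluster_state(copies: int, available_storage_daemons: int) -> str:
    """Toy model of the rule from this thread (hypothetical helper, not sheepdog code).

    The cluster keeps serving I/O only while at least `copies` storage
    daemons are available; otherwise it is halted.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    if available_storage_daemons >= copies:
        return "running"
    return "halted"


# Setup from the thread: 3 servers, one disk daemon each, --copies=3.
print(cluster_state(copies=3, available_storage_daemons=3))
# After killing the disk daemon on 10.6.0.102, only 2 are left.
print(cluster_state(copies=3, available_storage_daemons=2))
```

Under this model, the 3-server setup tolerates a node failure only if the cluster is formatted with fewer copies than there are storage daemons, for example --copies=2.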