[sheepdog-users] Unexpected freeze of sheep on one node

Micha Kersloot micha at kovoks.nl
Wed Nov 19 10:44:34 CET 2014


Hi,

It looks like a network problem or a failing hard drive to me.
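
One rough way to narrow down the network side is to probe the address and ports that show up in the quoted log from another node. The sketch below is only an illustration: the helper name `probe` and the 3-second timeout are assumptions, and it uses nothing but Python's standard `socket` module, not any sheepdog tooling.

```python
import socket


def probe(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection on this port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout: packets dropped or the host is unreachable.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. "No route to host".
        return f"error: {exc}"


# Example call, using the address and ports from the quoted log:
#   for port in (7000, 3333):
#       print(port, probe("192.168.5.44", port))
```

A "refused" result would suggest the host is reachable but no process is listening on that port, while "timeout" points more towards the network path. Neither result says anything about the state of the disks.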

Kind regards,

Micha Kersloot

Stay up to date and receive the latest tips about Zimbra/KovoKs Contact:
http://twitter.com/kovoks

KovoKs B.V. is registered under KvK (Chamber of Commerce) number: 11033334

----- Original Message -----
> From: "Valerio Pachera" <sirio81 at gmail.com>
> To: "Lista sheepdog user" <sheepdog-users at lists.wpkg.org>
> Sent: Wednesday, November 19, 2014 10:32:03 AM
> Subject: Re: [sheepdog-users] Unexpected freeze of sheep on one node

> Last night I added node id0 back (without removing metadata).
> Recovery took a very long time, until 8:49 this morning.
> Once it was done, sheep froze again.
> After 10 minutes I had to kill it.
> 
> On node id0 there are no useful messages (sheep.log)
> Nov 19 09:37:40   INFO [main] recover_object_main(863) object recovery
> progress  98%
> Nov 19 09:43:59   INFO [main] recover_object_main(863) object recovery
> progress  99%
> Nov 19 09:49:54 NOTICE [main] cluster_recovery_completion(703) all
> nodes are recovered, epoch 25
> 
> On node id1 I see a huge number of these messages:
> Nov 19 09:58:33  ERROR [gway 8476] sockfd_cache_get_long(348) fallback
> to non-io connection
> Nov 19 09:58:33  ERROR [gway 8628] connect_to(193) failed to connect
> to 192.168.5.44:7000: Connection refused
> Nov 19 09:58:33  ERROR [gway 8630] connect_to(193) failed to connect
> to 192.168.5.44:7000: Connection refused
> Nov 19 09:58:33  ERROR [gway 6514] connect_to(193) failed to connect
> to 192.168.5.44:3333: Connection refused
> Nov 19 09:58:33  ERROR [gway 8628] connect_to(193) failed to connect
> to 192.168.5.44:7000: Connection refused
> 
> After filtering out these 'Connection refused' messages, I see the
> poll-wait and 'failed to connect' messages repeating until I killed the node:
> grep 'Nov 19' sheep.log | grep -v 'Connection refused' | grep -v
> 'fallback to non-io connection'
> <cut>
> Nov 19 09:45:04  ERROR [io 7515] sheep_exec_req(1096) failed Failed to
> find requested tag
> Nov 19 09:45:04  ERROR [io 7515] sheep_exec_req(1096) failed Failed to
> find requested tag
> Nov 19 09:45:04  ERROR [io 7515] sheep_exec_req(1096) failed Failed to
> find requested tag
> Nov 19 09:49:54 NOTICE [main] cluster_recovery_completion(703) all
> nodes are recovered, epoch 25
> Nov 19 09:50:08   WARN [gway 8629] wait_forward_request(389) poll
> timeout 1, disks of some nodes or network is busy. Going to poll-wait
> again
> Nov 19 09:50:08   WARN [gway 8628] wait_forward_request(389) poll
> timeout 1, disks of some nodes or network is busy. Going to poll-wait
> again
> Nov 19 09:50:13   WARN [gway 8629] wait_forward_request(389) poll
> timeout 1, disks of some nodes or network is busy. Going to poll-wait
> again
> <cut>
> Nov 19 09:51:19  ERROR [gway 8630] connect_to(193) failed to connect
> to 192.168.5.44:3333: Operation now in progress
> 
> I don't understand what is wrong with this node.
> --
> sheepdog-users mailing lists
> sheepdog-users at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog-users
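
As an alternative to the chained grep filters in the quoted message, a small script can count messages by log level and function, so that rare lines are not buried under the repeated ones. This is a minimal sketch: the line layout is inferred only from the excerpts quoted above, it assumes each entry sits on a single line in sheep.log (the wrapping above comes from the mail client), and the names `LINE_RE` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Layout inferred from the quoted excerpts, e.g.
#   "Nov 19 09:58:33  ERROR [gway 8628] connect_to(193) failed to connect ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<func>\w+)\((?P<line>\d+)\)\s+"
    r"(?P<msg>.*)$"
)


def summarize(lines):
    """Count log entries per (level, function); unparsable lines are skipped."""
    counts = Counter()
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match:
            counts[(match["level"], match["func"])] += 1
    return counts


# Sample entries copied from the quoted log, joined back onto one line each.
SAMPLE = [
    "Nov 19 09:58:33  ERROR [gway 8476] sockfd_cache_get_long(348) fallback to non-io connection",
    "Nov 19 09:58:33  ERROR [gway 8628] connect_to(193) failed to connect to 192.168.5.44:7000: Connection refused",
    "Nov 19 09:58:33  ERROR [gway 6514] connect_to(193) failed to connect to 192.168.5.44:3333: Connection refused",
    "Nov 19 09:50:08   WARN [gway 8629] wait_forward_request(389) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again",
    "Nov 19 09:49:54 NOTICE [main] cluster_recovery_completion(703) all nodes are recovered, epoch 25",
]

for (level, func), count in summarize(SAMPLE).most_common():
    print(f"{count:6d}  {level:<6}  {func}")
```

To run it on a real log, pass an open file instead of `SAMPLE`, for example `summarize(open("sheep.log"))`, adjusting the path to wherever sheep.log lives on the node.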
