[sheepdog-users] Upgrade to 0.7.0_26_gc65bb2f fails to start

Andrew J. Hobbs ajhobbs at desu.edu
Fri Aug 23 20:01:09 CEST 2013


A follow-up to the follow-up.  The hint was in build_node_list(460) nr_sd_nodes:2 (the sd_accept_handler lines in the log below list both 10.254.0.1 and 10.254.0.3).  It turned out there was a hung sheep instance on node 3 that never went down, and it caused every attempt at restarting the other nodes to halt immediately.  I killed it and rebooted everything for good measure.  The cluster is back up and recovered, and instances are running again.
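
For anyone hitting the same thing, here is a rough sketch of how that check could be scripted against a debug log.  It assumes the log line format shown in the quoted messages below, and it assumes that the first sheep started after a full cluster stop should see only itself as a member.  The function names (members_seen, unexpected_members) and the regular expressions are made up for illustration; they are not part of sheepdog.

```python
import re

# Patterns are guesses based on the debug log lines quoted in this thread,
# e.g. "... build_node_list(460) nr_sd_nodes:2" and
#      "... sd_accept_handler(888) [1] IPv4 ip:10.254.0.3 port:7000".
COUNT_RE = re.compile(r"build_node_list\(\d+\) nr_sd_nodes:(\d+)")
MEMBER_RE = re.compile(
    r"sd_accept_handler\(\d+\) \[(\d+)\] IPv4 ip:(\S+) port:(\d+)"
)


def members_seen(log_lines):
    """Return (last reported node count or None, list of (ip, port))."""
    count = None
    members = []
    for line in log_lines:
        m = COUNT_RE.search(line)
        if m:
            count = int(m.group(1))
            continue
        m = MEMBER_RE.search(line)
        if m:
            members.append((m.group(2), int(m.group(3))))
    return count, members


def unexpected_members(log_lines, local_ip):
    """List member IPs other than the node that was just started."""
    _, members = members_seen(log_lines)
    return [ip for ip, _ in members if ip != local_ip]


# Sample lines copied from the log in this thread.
SAMPLE = [
    "Aug 23 13:14:50  DEBUG [main] build_node_list(460) nr_sd_nodes:2",
    "Aug 23 13:14:50  DEBUG [main] sd_accept_handler(886) join IPv4 ip:10.254.0.1 port:7000",
    "Aug 23 13:14:50  DEBUG [main] sd_accept_handler(888) [0] IPv4 ip:10.254.0.1 port:7000",
    "Aug 23 13:14:50  DEBUG [main] sd_accept_handler(888) [1] IPv4 ip:10.254.0.3 port:7000",
]

count, members = members_seen(SAMPLE)
print(count, members)
print(unexpected_members(SAMPLE, "10.254.0.1"))
```

Any address reported by unexpected_members is a candidate for a leftover sheep process that is worth checking by hand on that host.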


On 08/23/2013 01:17 PM, Andrew J. Hobbs wrote:

Reverted to the previously running rc of 0.7.0.  This now also fails.  Attempting a reboot now.  Will try again and attach a log segment with debug messages enabled.

<large number of these snipped>
Aug 23 13:14:50  DEBUG [main] add_to_lru_cache(684) oid cf50080000091f added
Aug 23 13:14:50  DEBUG [main] load_cache_object(1262) cf50080000091f
Aug 23 13:14:50  DEBUG [main] add_to_lru_cache(684) oid cf500300000036 added
Aug 23 13:14:50  DEBUG [main] load_cache_object(1262) cf500300000036
Aug 23 13:14:50   INFO [main] check_host_env(465) Allowed open files 100000, suggested 1024000
Aug 23 13:14:50  DEBUG [main] check_host_env(471) Allowed core file size 0, suggested unlimited
Aug 23 13:14:50   INFO [main] main(854) sheepdog daemon (version 0.7.0_26_gc65bb2f) started
Aug 23 13:14:50  DEBUG [main] zk_event_handler(1012) 1, 1761
Aug 23 13:14:50  DEBUG [main] zk_queue_pop_advance(402) /sheepdog/queue/0000001761, type:2, len:114872, pos:1761
Aug 23 13:14:50  DEBUG [main] zk_handle_accept(854) ACCEPT
Aug 23 13:14:50  DEBUG [main] init_node_list(838) 1
Aug 23 13:14:50  DEBUG [main] zk_handle_accept(859) IPv4 ip:10.254.0.1 port:7000
Aug 23 13:14:50  DEBUG [main] zk_handle_accept(865) create path:/sheepdog/member/IPv4 ip:10.254.0.1 port:7000
Aug 23 13:14:50  DEBUG [main] zk_watcher(522) path:/sheepdog/member/IPv4 ip:10.254.0.1 port:7000, type:1
Aug 23 13:14:50  DEBUG [main] build_node_list(460) nr_sd_nodes:2
Aug 23 13:14:50  DEBUG [main] sd_accept_handler(886) join IPv4 ip:10.254.0.1 port:7000
Aug 23 13:14:50  DEBUG [main] sd_accept_handler(888) [0] IPv4 ip:10.254.0.1 port:7000
Aug 23 13:14:50  DEBUG [main] sd_accept_handler(888) [1] IPv4 ip:10.254.0.3 port:7000
Aug 23 13:14:50  DEBUG [main] zk_watcher(522) path:/sheepdog/member, type:4
Aug 23 13:14:50   INFO [main] main(861) shutdown
Aug 23 13:14:50   INFO [main] zk_leave(780) leaving from cluster
Aug 23 13:14:50  DEBUG [main] zk_watcher(522) path:/sheepdog/queue/0000001762, type:1
Aug 23 13:14:50  DEBUG [main] zk_queue_push(362) create path:/sheepdog/queue/0000001762, queue_pos:0000001762, len:144
Aug 23 13:14:50  DEBUG [main] zk_watcher(522) path:/sheepdog/member/IPv4 ip:10.254.0.1 port:7000, type:2
Aug 23 13:14:50   INFO [main] main(866) cleaning journal file
Aug 23 13:14:50  DEBUG [main] zk_queue_push(362) create path:/sheepdog/queue/0000001763, queue_pos:0000001762, len:144

Note that I'm not seeing anything in the log that indicates an issue; the daemon simply started and then shut down.
Is it possible that the cluster shutdown command has persisted in zookeeper or some other location?

On 08/23/2013 12:57 PM, Andrew J. Hobbs wrote:

Aug 23 12:52:05   INFO [main] send_join_request(770) IPv4 ip:10.254.0.1 port:7000
Aug 23 12:52:05  ERROR [main] for_each_object_in_stale(383) /var/lib/sheepdog/obj/.stale
Aug 23 12:52:05   INFO [main] check_host_env(465) Allowed open files 100000, suggested 1024000
Aug 23 12:52:05   INFO [main] main(854) sheepdog daemon (version 0.7.0_26_gc65bb2f) started
Aug 23 12:52:05   INFO [main] main(861) shutdown
Aug 23 12:52:05   INFO [main] zk_leave(780) leaving from cluster

For now I'm going to revert to a prior build, but I'm not sure how to
proceed.  The /var/lib/sheepdog/obj/.stale directory has no content.

This build was pulled from git approximately 20 minutes ago.









