[sheepdog] howto stop a sheep?

Wed Jul 18 08:26:37 CEST 2012

On 07/18/2012 02:11 PM, Dietmar Maurer wrote:
> My current init.d script simply sends a kill -15 to stop the sheep daemons.
> 
> Is there a better way to do that?
> 
> When I try to restart the sheep I get (only sometimes): 
> 
> Jul 18 07:57:36 [main] crash_handler(402) sheep pid 10672 exited unexpectedly.
> Jul 18 07:57:37 [main] jrnl_recover(237) opening the directory /var/lib/sheepdog/disc1/journal/
> Jul 18 07:57:37 [main] jrnl_recover(242) starting journal recovery
> Jul 18 07:57:37 [main] jrnl_recover(298) journal recovery complete
> Jul 18 07:57:37 [main] init_sys_vdi_bitmap(306) found the working directory /var/lib/sheepdog/disc1/obj/
> Jul 18 07:57:37 [main] send_join_request(964) IPv4 ip:192.168.2.2 port:7000
> Jul 18 07:57:37 [main] main(313) sheepdog daemon (version 0.4.0) started
> Jul 18 07:57:37 [main] update_cluster_info(780) status = 4, epoch = 12, finished: 0
> Jul 18 07:57:37 [main] sd_check_join_cb(921) 192.168.2.2:7001: ret = 0x0, cluster_status = 0x4
> Jul 18 07:57:37 [main] update_cluster_info(780) status = 4, epoch = 12, finished: 1
> Jul 18 07:57:37 [main] cluster_sanity_check(501) joining node epoch too large: 13 vs 12
> Jul 18 07:57:37 [main] cluster_wait_for_join_check(524) transfer mastership (13, 12)
> Jul 18 07:57:37 [main] sd_check_join_cb(921) 192.168.2.2:7002: ret = 0x3, cluster_status = 0x4
> Jul 18 07:57:37 [main] __corosync_dispatch_one(312) failed to join sheepdog cluster: please retry when master is up
> Jul 18 07:57:37 [main] crash_handler(402) sheep pid 27848 exited unexpectedly.
> 
> Any idea why that happens? It works if I simply try again.
> 
> - Dietmar
> 

It seems to have something to do with the patch set introduced by Hellwig?
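
Independent of that, an init.d script that sends kill -15 and restarts
right away may start the new sheep while the old one is still shutting
down. Below is a minimal sketch of a stop step that waits for the process
to exit before returning. The function name stop_and_wait and the 30 second
default timeout are assumptions made for illustration, not anything sheep
provides, and whether waiting avoids the epoch mismatch in the log above
has not been confirmed.

```shell
#!/bin/sh
# Hypothetical helper (not part of sheepdog): send SIGTERM to a PID and
# poll until the process is gone. Returns 0 once the process has exited,
# 1 if it is still running after the timeout.
stop_and_wait() {
    pid=$1
    timeout=${2:-30}    # seconds; arbitrary default chosen for this sketch

    # Nothing to do if the process is already gone.
    kill -0 "$pid" 2>/dev/null || return 0

    # Same signal the existing init.d script sends (kill -15).
    kill -TERM "$pid" 2>/dev/null

    i=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$i" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        i=$((i + 1))
    done
    return 0
}
```

A restart action could call stop_and_wait with the daemon's PID and start
the new sheep only when it returns 0.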

Thanks,
Yuan