[sheepdog] pidfile option? [main] crash_handler(408) sheep pid 31144 exited unexpectedly.
Jens WEBER
jweber at tek2b.org
Tue Jul 17 17:55:22 CEST 2012
The problem is back again :-(
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# /etc/init.d/sheepdog stop
[ ok ] Stopping sheepdog: sheepdog_1.
[ ok ] Stopping sheepdog: sheepdog_2.
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# rm -r /var/lib/sheepdog/disc2/*
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# rm -r /var/lib/sheepdog/disc1/*
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# /etc/init.d/sheepdog start
[ ok ] Starting sheepdog : sheepdog_1.
[ ok ] Starting sheepdog : sheepdog_2.
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# collie cluster format -H -c 1
using backend farm store
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# collie vdi create test 50M -P
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# collie node info
Id      Size     Used    Use%
 0      238 MB   32 MB   13%
 1      238 MB   24 MB   10%
Total   475 MB   56 MB   11%

Total virtual image size  50 MB
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# /etc/init.d/sheepdog stop
[ ok ] Stopping sheepdog: sheepdog_1.
[ ok ] Stopping sheepdog: sheepdog_2.
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# /etc/init.d/sheepdog start
[ ok ] Starting sheepdog : sheepdog_1.
[ ok ] Starting sheepdog : sheepdog_2.
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# collie node info
[main] connect_to(234) failed to connect to localhost:7000: Connection refused
Failed to get node list
root at sheep01:/home/jens/sheepdog/debian/sheepdog-0.4.0-0+tek2b# /etc/init.d/sheepdog status
[FAIL] sheepdog_1 is not running ... failed!
[ ok ] sheepdog_2 is running.
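
In case it helps: a minimal debugging sketch, using only the paths and pidfile
names that appear in this transcript and in the commands quoted below (nothing
else is assumed). The tail of the dead daemon's log usually names the crash
reason, and a crashed sheep can leave a stale pidfile behind that confuses the
init script on the next stop/start:

  # why did sheepdog_1 die? the last log lines usually say
  tail -n 20 /var/lib/sheepdog/disc1/sheep.log

  # is a stale pidfile left over from the crashed sheep?
  cat /var/run/sheepdog_1.pid
  ps -p "$(cat /var/run/sheepdog_1.pid)"

A sketch of an init fragment that drives both daemons through their pidfiles
follows at the end of this message.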
> I can't explain it. After doing some tests with another version and upgrading
> again to 0.4.0-0+tek2b-4, it works.
>
> The first time, the sheepdog dir disc2 was 100% full; could that explain it?
> After the last upgrade I deleted disc1 and disc2 and created a new cluster; stop
> and start now work fine.
>
> Any idea?
>
> > I was just testing the init.d startup script. The first sheep crashes when the
> > second sheep starts.
> >
> > /usr/sbin/sheep --pidfile /var/run/sheepdog_1.pid -d -p 7000 -v 32 -z 1
> > /var/lib/sheepdog/disc1
> > /var/lib/sheepdog/disc1/sheep.log
> > Jul 17 15:03:26 [main] jrnl_recover(237) opening the directory
> > /var/lib/sheepdog/disc1/journal/
> > Jul 17 15:03:26 [main] jrnl_recover(242) starting journal recovery
> > Jul 17 15:03:26 [main] jrnl_recover(298) journal recovery complete
> > Jul 17 15:03:26 [main] farm_init(333) use farm store driver
> > Jul 17 15:03:26 [main] init_sys_vdi_bitmap(306) found the working directory
> > /var/lib/sheepdog/disc1/obj/
> > Jul 17 15:03:26 [main] create_cluster(1095) use corosync cluster driver as
> > default
> > Jul 17 15:03:26 [main] create_cluster(1124) zone id = 1
> > Jul 17 15:03:26 [main] send_join_request(964) IPv4 ip:172.30.0.80 port:7000
> > Jul 17 15:03:26 [main] main(313) sheepdog daemon (version 0.4.0) started
> > Jul 17 15:03:26 [main] cdrv_cpg_confchg(568) mem:1, joined:1, left:0
> > Jul 17 15:03:26 [main] cdrv_cpg_deliver(454) 0
> > Jul 17 15:03:26 [main] sd_check_join_cb(897) IPv4 ip:172.30.0.80 port:7000
> > Jul 17 15:03:26 [main] cdrv_cpg_deliver(454) 1
> > Jul 17 15:03:26 [main] sd_join_handler(993) join IPv4 ip:172.30.0.80 port:7000
> > Jul 17 15:03:26 [main] sd_join_handler(995) [0] IPv4 ip:172.30.0.80 port:7000
> > Jul 17 15:03:26 [main] update_cluster_info(780) status = 4, epoch = 5,
> > finished: 0
> > Jul 17 15:03:26 [main] sockfd_cache_add_group(242) 1
> > Jul 17 15:03:26 [main] sd_join_handler(1004) join Sheepdog cluster
> >
> > /usr/sbin/sheep --pidfile /var/run/sheepdog_2.pid -d -p 7001 -v 32 -z 2
> > /var/lib/sheepdog/disc2
> > /var/lib/sheepdog/disc2/sheep.log
> > Jul 17 15:03:47 [main] jrnl_recover(237) opening the directory
> > /var/lib/sheepdog/disc2/journal/
> > Jul 17 15:03:47 [main] jrnl_recover(242) starting journal recovery
> > Jul 17 15:03:47 [main] jrnl_recover(298) journal recovery complete
> > Jul 17 15:03:47 [main] farm_init(333) use farm store driver
> > Jul 17 15:03:47 [main] init_sys_vdi_bitmap(306) found the working directory
> > /var/lib/sheepdog/disc2/obj/
> > Jul 17 15:03:47 [main] init_sys_vdi_bitmap(320) found the VDI object
> > 80fd366200000000
> > Jul 17 15:03:47 [main] create_cluster(1095) use corosync cluster driver as
> > default
> > Jul 17 15:03:47 [main] create_cluster(1124) zone id = 2
> > Jul 17 15:03:47 [main] send_join_request(964) IPv4 ip:172.30.0.80 port:7001
> > Jul 17 15:03:47 [main] main(313) sheepdog daemon (version 0.4.0) started
> > Jul 17 15:03:47 [main] cdrv_cpg_confchg(568) mem:2, joined:1, left:0
> > Jul 17 15:03:47 [main] cdrv_cpg_confchg(634) Not promoting because member is
> > not in our event list.
> > Jul 17 15:03:47 [main] cdrv_cpg_deliver(454) 0
> > Jul 17 15:03:47 [main] cdrv_cpg_deliver(454) 1
> > Jul 17 15:03:47 [main] sd_join_handler(993) join IPv4 ip:172.30.0.80 port:7001
> > Jul 17 15:03:47 [main] sd_join_handler(995) [0] IPv4 ip:172.30.0.80 port:7000
> > Jul 17 15:03:47 [main] sd_join_handler(995) [1] IPv4 ip:172.30.0.80 port:7001
> > Jul 17 15:03:47 [main] update_cluster_info(780) status = 4, epoch = 5,
> > finished: 0
> > Jul 17 15:03:47 [main] sockfd_cache_add_group(242) 2
> > Jul 17 15:03:47 [main] sd_join_handler(1004) join Sheepdog cluster
> > Jul 17 15:03:47 [main] cdrv_cpg_confchg(568) mem:1, joined:0, left:1
> > Jul 17 15:03:47 [main] sd_leave_handler(1062) leave IPv4 ip:172.30.0.80
> > port:7000
> > Jul 17 15:03:47 [main] sd_leave_handler(1064) [0] IPv4 ip:172.30.0.80
> > port:7001
> > Jul 17 15:03:47 [main] sockfd_cache_del(218) 172.30.0.80:7000, count 1
> > /var/lib/sheepdog/disc1/sheep.log
> > Jul 17 15:03:47 [main] cdrv_cpg_confchg(568) mem:2, joined:1, left:0
> > Jul 17 15:03:47 [main] cdrv_cpg_deliver(454) 0
> > Jul 17 15:03:47 [main] sd_check_join_cb(921) 172.30.0.80:7001: ret = 0x2,
> > cluster_status = 0x4
> > Jul 17 15:03:47 [main] cdrv_cpg_deliver(454) 1
> > Jul 17 15:03:47 [main] crash_handler(408) sheep pid 31255 exited unexpectedly.
> >
> > Thanks Jens
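
For completeness, here is a minimal sketch of how an init.d script could drive
the two daemons through their pidfiles with start-stop-daemon, since that is
what the subject line asks about. The ports, zones, data directories, pidfile
names and sheep options are copied from the commands quoted above; the script
structure itself is an assumption, not the actual 0.4.0-0+tek2b packaging:

  #!/bin/sh
  # Minimal sketch only, not the real init script from the package.
  DAEMON=/usr/sbin/sheep

  start_one() {  # args: instance port zone datadir
      start-stop-daemon --start --pidfile "/var/run/sheepdog_$1.pid" \
          --exec "$DAEMON" -- --pidfile "/var/run/sheepdog_$1.pid" \
          -d -p "$2" -v 32 -z "$3" "$4"
  }

  stop_one() {   # args: instance
      start-stop-daemon --stop --oknodo --pidfile "/var/run/sheepdog_$1.pid"
      rm -f "/var/run/sheepdog_$1.pid"  # drop a stale pidfile after a crash
  }

  case "$1" in
  start)
      start_one 1 7000 1 /var/lib/sheepdog/disc1
      start_one 2 7001 2 /var/lib/sheepdog/disc2
      ;;
  stop)
      stop_one 2
      stop_one 1
      ;;
  esac

The explicit rm of the pidfile is there because a crashed sheep (as in the
logs above) never gets to remove its own.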