[sheepdog-users] Collie node kill and md
Valerio Pachera
sirio81 at gmail.com
Wed May 22 09:07:11 CEST 2013
Hi, on my production cluster I tried to kill one of the 3 nodes and
restart sheep right after.
(Sheepdog daemon version 0.5.5_335_g25a93bf)
root@sheepdog004:~# collie node list
M   Id   Host:Port           V-Nodes   Zone
-   0    192.168.6.41:7000   85        688302272
-   1    192.168.6.42:7000   85        705079488
-   2    192.168.6.44:7000   21        738633920
root@sheepdog004:~# collie node info
Id      Size     Used     Use%
0       1.6 TB   1.0 TB   64%
1       1.6 TB   978 GB   57%
2       2.1 TB   236 GB   10%
Total   5.4 TB   2.2 TB   41%
Total virtual image size   1.2 TB
root@sheepdog004:~# collie node kill 2
root@sheepdog004:~# sheep -w size=20000 /mnt/wd_WCAYUEP99298,/mnt/wd_WCAYUEP99298/obj,/mnt/wd_WCAWZ1588874
root@sheepdog004:~# collie node info
Id      Size     Used     Use%
0       1.6 TB   1.0 TB   64%
1       1.6 TB   978 GB   57%
2       466 GB   72 MB    0%
Total   3.7 TB   2.0 TB   53%
root@sheepdog004:~# collie node md info
Id   Size     Use      Path
0    422 GB   0.0 MB   /mnt/wd_WCAYUEP99298/obj
1    1.6 TB   980 MB   /mnt/wd_WCAWZ1588874
root@sheepdog004:~# collie node recovery
Nodes In Recovery:
Id   Host:Port           V-Nodes   Zone
2    192.168.6.44:7000   21        738633920
sheep.log on node 2:
May 22 08:54:32 [main] main(752) shutdown
May 22 08:54:38 [main] md_add_disk(164) /mnt/wd_WCAYUEP99298/obj, nr 1
May 22 08:54:38 [main] md_add_disk(164) /mnt/wd_WCAWZ1588874, nr 2
May 22 08:54:38 [main] send_join_request(1082) IPv4 ip:192.168.6.44 port:7000
May 22 08:54:38 [main] check_host_env(381) WARN: Allowed open files 1024 too small, suggested 1024000
May 22 08:54:38 [main] check_host_env(390) Allowed core file size 0, suggested unlimited
May 22 08:54:38 [main] main(745) sheepdog daemon (version 0.5.5_335_g25a93bf) started
May 22 08:54:38 [main] update_cluster_info(862) status = 1, epoch = 4, finished: 0
May 22 08:54:40 [rw 17255] recover_object_work(205) done:0 count:60534, oid:c8d1280002992d
May 22 08:54:42 [rw 17255] recover_object_work(205) done:1 count:60534, oid:c8d1280000081f
May 22 08:54:43 [rw 17255] recover_object_work(205) done:2 count:60534, oid:c8d1280003c3d0
...
May 22 08:54:49 [gway 17253] gateway_read_obj(60) local read 80c8be4d00000000 failed, No object found
May 22 08:54:49 [gway 17253] gateway_read_obj(60) local read 80e149bf00000000 failed, No object found
May 22 08:54:49 [rw 17255] recover_object_work(205) done:19 count:60534, oid:c8d12800018e38
...
May 22 08:55:16 [gway 17253] gateway_read_obj(60) local read 80c8be4d00000000 failed, No object found
May 22 08:55:16 [gway 17253] gateway_read_obj(60) local read 80e149bf00000000 failed, No object found
May 22 08:55:16 [rw 17255] recover_object_work(205) done:109 count:60534, oid:c8d1280000ff6b
...
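To see how far the recovery has progressed, the done/count pairs in the recover_object_work lines can be parsed. This is only a rough sketch in Python, assuming the log format shown in the excerpt above; the function name recovery_progress and the regular expression are my own and not part of sheepdog.

```python
import re

# Matches the "done:N count:M" pair in recover_object_work log lines.
# The pattern is an assumption based on the sheep.log excerpt; \s+ also
# tolerates the pair being wrapped onto two lines.
PROGRESS_RE = re.compile(r"recover_object_work\(\d+\) done:(\d+)\s+count:(\d+)")


def recovery_progress(log_text):
    """Return (done, count, percent) from the last progress line, or None."""
    matches = PROGRESS_RE.findall(log_text)
    if not matches:
        return None
    done, count = map(int, matches[-1])
    percent = 100.0 * done / count if count else 0.0
    return done, count, percent


sample = (
    "May 22 08:55:16 [rw 17255] recover_object_work(205) done:109 "
    "count:60534, oid:c8d1280000ff6b\n"
)
print(recovery_progress(sample))
```

With the last line logged above (done:109 of count:60534), this reports that well under 1% of the objects had been recovered about 40 seconds after the restart.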
What do you think?
Is everything messed up?