Hi,

Regarding 7), I just started a clean test: I started sheepdog on 3 nodes
with empty btrfs filesystems and ran "shepherd mkfs --copies=2", and this
time it failed from the beginning. The details:

fire-srv2 ~ # shepherd info -t dog
  Idx   Node id (FNV-1a) - Host:Port
--------------------------------------------------
  0     2f1b:ef25:9c29:32a9 - 192.1.1.5:7000
  1     670d:187d:7eac:8b45 - 192.1.1.3:7000
* 2     22b6:4eb1:7fe0:aa0d - 192.1.1.4:7000
fire-srv2 ~ # shepherd info -t sheep
Id      Size    Used    Use%
 0      14G     0G      0%
 1      14G     0G      0%
 2      14G     0G      0%
Total   44G     0G      0%, total virtual VDI Size 5G
fire-srv2 ~ # shepherd info -t vm
Name            |Vdi size |Allocated| Shared  | Status
----------------+---------+---------+---------+------------
fire-srv2 ~ # shepherd info -t vdi
fire-srv2 ~ #

fire-srv3 ~ # kvm-img convert -f raw -O sheepdog /dev/sys/kvm-img zopa
find_vdi_name 1041: Invalid error code, zopa
find_vdi_name 1041: Invalid error code, zopa
qemu-img: Could not open 'zopa'
fire-srv3 ~ # kvm-img convert -f raw -O sheepdog /dev/sys/kvm-img zopa1
find_vdi_name 1041: Invalid error code, zopa1
find_vdi_name 1041: Invalid error code, zopa1
qemu-img: Could not open 'zopa1'
fire-srv3 ~ # cp /dev/null /var/log/messages
fire-srv3 ~ # kvm-img convert -f raw -O sheepdog /dev/sys/kvm-img zopa2
find_vdi_name 1041: Invalid error code, zopa2
find_vdi_name 1041: Invalid error code, zopa2
qemu-img: Could not open 'zopa2'
fire-srv3 ~ # cat /var/log/messages
Jan 3 09:53:43 localhost collie: listen_handler(337) accepted a new connection, 8
Jan 3 09:53:43 localhost collie: cluster_queue_request(175) 0x188c4c0 19
Jan 3 09:53:43 localhost collie: client_handler(298) closed a connection, 8
Jan 3 09:53:43 localhost collie: listen_handler(337) accepted a new connection, 8
Jan 3 09:53:43 localhost collie: cluster_queue_request(175) 0x18824b0 11
Jan 3 09:53:43 localhost collie: sd_deliver(490) op: 2, done: 0, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:43 localhost collie: __sd_deliver(437) op: 2, done: 0, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:43 localhost collie: add_vdi(99) zopa2 (5) 5368709120, base: 0
Jan 3 09:53:43 localhost collie: listen_handler(337) accepted a new connection, 25
Jan 3 09:53:43 localhost collie: client_handler(298) closed a connection, 25
Jan 3 09:53:43 localhost collie: listen_handler(337) accepted a new connection, 25
Jan 3 09:53:43 localhost collie: so_queue_request(599) c0000
Jan 3 09:53:43 localhost collie: client_handler(298) closed a connection, 25
Jan 3 09:53:43 localhost collie: add_vdi(129) zopa2 (5) 5368709120, base: 786432
Jan 3 09:53:43 localhost collie: listen_handler(337) accepted a new connection, 25
Jan 3 09:53:43 localhost collie: store_queue_request(181) 2, 1, /sheepdog/0/c0000, 3, 3
Jan 3 09:53:43 localhost collie: client_handler(298) closed a connection, 25
Jan 3 09:53:43 localhost collie: sd_deliver(490) op: 2, done: 1, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:43 localhost collie: __sd_deliver(437) op: 2, done: 1, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:43 localhost collie: client_handler(298) closed a connection, 8
Jan 3 09:53:43 localhost collie: listen_handler(337) accepted a new connection, 8
Jan 3 09:53:43 localhost collie: cluster_queue_request(175) 0x188c4c0 19
Jan 3 09:53:43 localhost collie: client_handler(298) closed a connection, 8
Jan 3 09:53:43 localhost collie: listen_handler(337) accepted a new connection, 8
Jan 3 09:53:43 localhost collie: cluster_queue_request(175) 0x18824b0 18
Jan 3 09:53:43 localhost collie: sd_deliver(490) op: 2, done: 0, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:43 localhost collie: __sd_deliver(437) op: 2, done: 0, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:43 localhost collie: lookup_vdi(158) looking for zopa2 5
Jan 3 09:53:43 localhost collie: listen_handler(337) accepted a new connection, 25
Jan 3 09:53:43 localhost collie: client_handler(298) closed a connection, 25
Jan 3 09:53:43 localhost collie: lookup_vdi(189) looking for zopa2 c0000
Jan 3 09:53:43 localhost collie: sd_deliver(490) op: 2, done: 1, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:43 localhost collie: __sd_deliver(437) op: 2, done: 1, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:43 localhost collie: client_handler(298) closed a connection, 8
Jan 3 09:53:44 localhost collie: listen_handler(337) accepted a new connection, 8
Jan 3 09:53:44 localhost collie: cluster_queue_request(175) 0x188c4c0 19
Jan 3 09:53:44 localhost collie: client_handler(298) closed a connection, 8
Jan 3 09:53:44 localhost collie: listen_handler(337) accepted a new connection, 8
Jan 3 09:53:44 localhost collie: cluster_queue_request(175) 0x18824b0 18
Jan 3 09:53:44 localhost collie: sd_deliver(490) op: 2, done: 0, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:44 localhost collie: __sd_deliver(437) op: 2, done: 0, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:44 localhost collie: lookup_vdi(158) looking for zopa2 5
Jan 3 09:53:44 localhost collie: listen_handler(337) accepted a new connection, 25
Jan 3 09:53:44 localhost collie: lookup_vdi(189) looking for zopa2 c0000
Jan 3 09:53:44 localhost collie: client_handler(298) closed a connection, 25
Jan 3 09:53:44 localhost collie: sd_deliver(490) op: 2, done: 1, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:44 localhost collie: __sd_deliver(437) op: 2, done: 1, size: 142, from: 192.1.1.4:7000
Jan 3 09:53:44 localhost collie: client_handler(298) closed a connection, 8
fire-srv3 ~ #

fire-srv2 ~ # shepherd info -t vm
Name            |Vdi size |Allocated| Shared  | Status
----------------+---------+---------+---------+------------
zopa            |  5120 MB|     0 MB|     0 MB| not running
zopa1           |  5120 MB|     0 MB|     0 MB| not running
zopa2           |  5120 MB|     0 MB|     0 MB| not running
fire-srv2 ~ # shepherd info -t sheep
Id      Size    Used    Use%
 0      14G     0G      0%
 1      14G     0G      0%
 2      14G     0G      0%
Total   44G     0G      0%, total virtual VDI Size 15G
fire-srv2 ~ #

fire-srv3 ~ # find /sheepdog/
/sheepdog/
/sheepdog/0
/sheepdog/0/vdi
/sheepdog/0/vdi/zopa
/sheepdog/0/vdi/zopa/0000000000040000-00000000
/sheepdog/0/vdi/zopa1
/sheepdog/0/vdi/zopa1/0000000000080000-00000000
/sheepdog/0/vdi/zopa2
/sheepdog/0/vdi/zopa2/00000000000c0000-00000000
/sheepdog/0/40000
/sheepdog/0/0
/sheepdog/0/c0000
fire-srv3 ~ #

fire-srv2 ~ # find /sheepdog/
/sheepdog/
/sheepdog/0
/sheepdog/0/40000
/sheepdog/0/80000
/sheepdog/0/c0000
fire-srv2 ~ #

fire-srv4 ~ # find /sheepdog/
/sheepdog/
/sheepdog/0
/sheepdog/0/vdi
/sheepdog/0/vdi/zopa
/sheepdog/0/vdi/zopa/0000000000040000-00000000
/sheepdog/0/vdi/zopa1
/sheepdog/0/vdi/zopa1/0000000000080000-00000000
/sheepdog/0/vdi/zopa2
/sheepdog/0/vdi/zopa2/00000000000c0000-00000000
/sheepdog/0/0
/sheepdog/0/80000
fire-srv4 ~ #

fire-srv2 ~ # shepherd mkfs --copies=3
fire-srv2 ~ #

fire-srv3 ~ # kvm-img convert -f raw -O sheepdog /dev/sys/kvm-img zopa4
find_vdi_name 1041: Invalid error code, zopa4
find_vdi_name 1041: Invalid error code, zopa4
qemu-img: Could not open 'zopa4'
fire-srv3 ~ #

fire-srv2 ~ # find /sheepdog/
/sheepdog/
/sheepdog/0
/sheepdog/0/40000
/sheepdog/0/80000
/sheepdog/0/c0000
/sheepdog/0/vdi
/sheepdog/0/vdi/zopa4
/sheepdog/0/vdi/zopa4/0000000000040000-00000000
fire-srv2 ~ #

fire-srv3 ~ # find /sheepdog/
/sheepdog/
/sheepdog/0
/sheepdog/0/vdi
/sheepdog/0/vdi/zopa
/sheepdog/0/vdi/zopa/0000000000040000-00000000
/sheepdog/0/vdi/zopa1
/sheepdog/0/vdi/zopa1/0000000000080000-00000000
/sheepdog/0/vdi/zopa2
/sheepdog/0/vdi/zopa2/00000000000c0000-00000000
/sheepdog/0/vdi/zopa4
/sheepdog/0/vdi/zopa4/0000000000040000-00000000
/sheepdog/0/40000
/sheepdog/0/0
/sheepdog/0/c0000
fire-srv3 ~ #

fire-srv4 ~ # find /sheepdog/
/sheepdog/
/sheepdog/0
/sheepdog/0/vdi
/sheepdog/0/vdi/zopa
/sheepdog/0/vdi/zopa/0000000000040000-00000000
/sheepdog/0/vdi/zopa1
/sheepdog/0/vdi/zopa1/0000000000080000-00000000
/sheepdog/0/vdi/zopa2
/sheepdog/0/vdi/zopa2/00000000000c0000-00000000
/sheepdog/0/vdi/zopa4
/sheepdog/0/vdi/zopa4/0000000000040000-00000000
/sheepdog/0/0
/sheepdog/0/80000
/sheepdog/0/40000
fire-srv4 ~ #

After this I stopped sheepdog on all nodes, cleaned the /sheepdog
filesystems, started sheepdog again and ran mkfs with --copies=2, and now
"kvm-img convert -f raw -O sheepdog /dev/sys/kvm-img zopa5" worked OK.

So this problem is not always reproducible, but it happens sometimes even
on a clean start.

Alex Piavlo wrote:
>>> 7) After I created several images and stop sheepdog all nodes and
>>> started them later again, all VMs can be listed but then I try to create
>>> another image I get
>>>
>>> shell-srv1> kvm-img convert -f raw -O sheepdog /dev/sys/kvm-img foo
>>> find_vdi_name 1041: Invalid error code, foo
>>> find_vdi_name 1041: Invalid error code, foo
>>> qemu-img: Could not open 'foo'
>>> sheel-srv1>
>>>
>>>
>> Sorry, I couldn't reproduce the problem. Please confirm that applying
>> the patch and running make on all three nodes. If the problem is not
>> resolved, would you send me collie logs in /var/log/syslog?
>>
>
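
The per-node "find /sheepdog/" listings in this report are easier to compare when tallied by object name. The following is only a rough sketch, not part of sheepdog: it assumes that entries directly under /sheepdog/0 (other than the vdi directory) are object files, which is an interpretation of the listings rather than something the thread confirms. The function names are hypothetical, and the sample data is the first set of listings taken after "shepherd mkfs --copies=2", with the vdi/ subtree abridged.

```python
from collections import Counter


def object_names(find_output):
    """Return the names of entries directly under /sheepdog/<dir>/.

    Assumption: these entries are object files; the vdi/ subtree is skipped.
    """
    names = set()
    for path in find_output.split():
        parts = path.strip("/").split("/")
        # Keep /sheepdog/<dir>/<name>; ignore shallower and deeper paths.
        if len(parts) == 3 and parts[0] == "sheepdog" and parts[2] != "vdi":
            names.add(parts[2])
    return names


def replica_counts(listings):
    """Count on how many nodes each object name appears."""
    counts = Counter()
    for output in listings.values():
        counts.update(object_names(output))
    return dict(counts)


# First set of listings from this report (vdi/ entries abridged).
listings = {
    "fire-srv2": """/sheepdog/ /sheepdog/0 /sheepdog/0/40000
                    /sheepdog/0/80000 /sheepdog/0/c0000""",
    "fire-srv3": """/sheepdog/ /sheepdog/0 /sheepdog/0/vdi
                    /sheepdog/0/vdi/zopa /sheepdog/0/40000
                    /sheepdog/0/0 /sheepdog/0/c0000""",
    "fire-srv4": """/sheepdog/ /sheepdog/0 /sheepdog/0/vdi
                    /sheepdog/0/vdi/zopa /sheepdog/0/0 /sheepdog/0/80000""",
}

if __name__ == "__main__":
    for name, count in sorted(replica_counts(listings).items()):
        print(name, count)
```

On these listings every name (0, 40000, 80000, c0000) shows up on exactly two nodes, which would be consistent with --copies=2 if the assumption about the layout holds.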