[sheepdog-users] Testing snapshot-object-reclaim

Valerio Pachera sirio81 at gmail.com
Thu Feb 27 11:39:24 CET 2014


Note: I have just noticed that I ended up with 3 different versions of
sheep, even though I used the same procedure for updating sheep:

make uninstall
make clean
git pull
./autogen.sh
./configure --enable-zookeeper
make -j 2 install

Sheepdog daemon version 0.8.0_114_gaec303c
Sheepdog daemon version 0.8.0_114_gfee7879
Sheepdog daemon version 0.8.0_114_g6e50ca7
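
The three strings above differ only in the part after "g", which looks
like an abbreviated git commit hash (my assumption, based on the usual
git-describe style), so each build probably came from a different
commit. A minimal sketch for checking this quickly: the helper name
count_distinct_builds is made up for illustration, and it only reads
version lines from stdin. How the lines are collected from each node is
left out (I assume something like the sheep version option).

```shell
#!/bin/sh
# Hypothetical helper: count distinct builds among
# "Sheepdog daemon version ..." lines read from stdin.
count_distinct_builds() {
    # The version string is the last whitespace-separated field.
    awk '/Sheepdog daemon version/ { print $NF }' | sort -u | wc -l | tr -d ' '
}

# Sample input: the three version lines reported above.
count_distinct_builds <<'EOF'
Sheepdog daemon version 0.8.0_114_gaec303c
Sheepdog daemon version 0.8.0_114_gfee7879
Sheepdog daemon version 0.8.0_114_g6e50ca7
EOF
# prints 3
```

A result of 1 would mean every node runs the same build; anything
higher means the builds differ.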

I got a cluster crash while writing data to my vdi and using snapshots.
I'm going to repeat the test with the exact same version of sheep on
every node.

About the crash, in case it is of interest:

On all 3 nodes of the cluster:

dog cluster info
Cluster status: running, auto-recovery enabled
Cluster created at Thu Feb 27 10:23:30 2014
Epoch Time           Version
2014-02-27 10:58:38      3 [192.168.10.5:7000]
2014-02-27 10:58:38      2 [192.168.10.4:7000, 192.168.10.5:7000]
2014-02-27 10:23:31      1 [192.168.10.4:7000, 192.168.10.5:7000,
192.168.10.6:7000]

On test004, sheep.log

Feb 27 10:58:37  ERROR [io 9281] get_erasure_index(37) failed to
getxattr /mnt/sheep/dsk02/087c2b260000009b, No data available
Feb 27 10:58:37  ERROR [io 9281] err_to_sderr(125)
oid=87c2b260000009b, No data available
Feb 27 10:58:37   INFO [main] md_remove_disk(323) /mnt/sheep/dsk02
from multi-disk array
Feb 27 10:58:37  ERROR [main] check_request_epoch(151) old node
version 2, 1 (CREATE_AND_WRITE_PEER)
Feb 27 10:58:37  ALERT [rw 7208] get_vdi_copy_number(100) copy number
for 5ddf88 not found, set 3
Feb 27 10:58:37  ALERT [rw 7208] get_vdi_copy_number(100) copy number
for 5ddf88 not found, set 3
Feb 27 10:58:37  ALERT [rw 7208] get_vdi_copy_number(100) copy number
for 5ddf88 not found, set 3
Feb 27 10:58:37  ALERT [rw 7208] get_vdi_copy_number(100) copy number
for 5ddf88 not found, set 3
Feb 27 10:58:37  ALERT [rw 7208] get_vdi_copy_number(100) copy number
for 5ddf88 not found, set 3

On test005, sheep.log

Feb 27 10:58:37  ERROR [io 9243] get_erasure_index(37) failed to
getxattr /mnt/sheep/dsk02/087c2b260000009b, No data available
Feb 27 10:58:37  ERROR [io 9243] err_to_sderr(125)
oid=87c2b260000009b, No data available
Feb 27 10:58:37   INFO [main] md_remove_disk(323) /mnt/sheep/dsk02
from multi-disk array
Feb 27 10:58:37  ERROR [main] get_erasure_index(37) failed to getxattr
/mnt/sheep/dsk01/087c2b250000022a, No data available
Feb 27 10:58:37  ERROR [main] get_erasure_index(37) failed to getxattr
/mnt/sheep/dsk01/087c2b2500000220, No data available
Feb 27 10:58:37  ERROR [main] get_erasure_index(37) failed to getxattr
/mnt/sheep/dsk01/087c2b2500000009, No data available
Feb 27 10:58:37  ERROR [main] get_erasure_index(37) failed to getxattr
/mnt/sheep/dsk01/087c2b2500000134, No data available


On test006 (the node running qemu)

Feb 27 10:58:37  ERROR [io 8563] get_erasure_index(37) failed to
getxattr /mnt/sheep/dsk01/087c2b260000009b, No data available
Feb 27 10:58:37  ERROR [io 8563] err_to_sderr(125)
oid=87c2b260000009b, No data available
Feb 27 10:58:37   INFO [main] md_remove_disk(323) /mnt/sheep/dsk01
from multi-disk array
Feb 27 10:58:37  ERROR [gway 8547] wait_forward_request(464) fail
87c2b260000009b, Network error between sheep
Feb 27 10:58:37  ERROR [io 8568] err_to_sderr(107)
/all/disks/are/broken/,ps/əʌo7/! corrupted
Feb 27 10:58:37  ERROR [main] io_op_done(48) leaving sheepdog cluster
Feb 27 10:58:37   INFO [main] zk_leave(955) leaving from cluster
Feb 27 10:58:37  ERROR [gway 8547] wait_forward_request(464) fail
87c2b260000009b, Network error between sheep
Feb 27 10:58:38  ERROR [gway 8547] wait_forward_request(464) fail
87c2b260000009b, Network error between sheep
Feb 27 10:58:38  ERROR [gway 8533] wait_forward_request(464) fail
807c2b2800000000, Request has an old epoch
Feb 27 10:58:38  ERROR [gway 8522] wait_forward_request(464) fail
7c2b28000000ab, Request has an old epoch
Feb 27 10:58:38  ERROR [gway 8533] wait_forward_request(464) fail
807c2b2800000000, Network error between sheep
Feb 27 10:58:38  ERROR [gway 8522] wait_forward_request(464) fail
7c2b28000000ab, Request has an old epoch
Feb 27 10:58:38  ERROR [gway 8499] wait_forward_request(464) fail
7c2b28000000ab, Network error between sheep
Feb 27 10:58:38  ERROR [gway 8543] wait_forward_request(464) fail
7c2b28000000aa, Request has an old epoch
Feb 27 10:58:38  ERROR [gway 8550] gateway_forward_request(531) There
isn't enough copies(1) to send out (3)
Feb 27 10:58:38  ERROR [gway 8557] gateway_forward_request(531) There
isn't enough copies(1) to send out (3)
Feb 27 10:58:38  ERROR [gway 8549] gateway_forward_request(531) There
isn't enough copies(1) to send out (3)
Feb 27 10:58:38  ERROR [gway 8466] wait_forward_request(464) fail
87c2b260000009b, Request has an old epoch
Feb 27 10:58:38  ERROR [gway 8532] wait_forward_request(464) fail
807c2b2800000000, Request has an old epoch
Feb 27 10:58:38  ERROR [gway 8543] wait_forward_request(464) fail
7c2b28000000aa, Request has an old epoch
Feb 27 10:58:38  ERROR [gway 8544] gateway_forward_request(531) There
isn't enough copies(1) to send out (3)
Feb 27 10:58:38  ERROR [gway 8551] gateway_forward_request(531) There
isn't enough copies(1) to send out (2)
Feb 27 10:58:38  ERROR [gway 8524] gateway_forward_request(531) There
isn't enough copies(1) to send out (2)
Feb 27 10:58:38  ERROR [gway 8558] gateway_forward_request(531) There
isn't enough copies(1) to send out (2)
Feb 27 10:58:38  ERROR [gway 8553] gateway_forward_request(531) There
isn't enough copies(1) to send out (2)
Feb 27 10:58:43  ERROR [gway 8520] wait_forward_request(464) fail
807c2b2800000000, No object found
Feb 27 10:58:43  ERROR [gway 8556] sd_write_object(395) failed to
write object 807c2b2800000000, No object found
Feb 27 10:58:43  ERROR [gway 8501] wait_forward_request(464) fail
807c2b2800000000, No object found
Feb 27 10:58:43  ERROR [gway 8559] sd_write_object(395) failed to
write object 807c2b2800000000, No object found
Feb 27 10:59:08  ERROR [gway 8545] do_read(236) failed to read from
socket: -1, Resource temporarily unavailable
Feb 27 10:59:08  ERROR [gway 8545] exec_req(347) failed to read a response
Feb 27 10:59:08  ERROR [gway 8545] sheep_exec_req(1096) failed Request
has an old epoch
Feb 27 10:59:08  ERROR [gway 8555] sheep_exec_req(1096) failed No object found
Feb 27 10:59:08  ERROR [gway 8529] read_backend_object(414) failed to
read object 807c2b2800000000, No object found
Feb 27 10:59:08  ERROR [gway 8529] prepare_obj_refcnt(592) failed to
read vdi, 807c2b2800000000
Feb 27 10:59:09  ERROR [gway 8536] gateway_forward_request(531) There
isn't enough copies(1) to send out (2)
Feb 27 10:59:09  ERROR [gway 8503] gateway_forward_request(531) There
isn't enough copies(1) to send out (3)
Feb 27 10:59:09  ERROR [gway 8534] gateway_forward_request(531) There
isn't enough copies(1) to send out (2)
Feb 27 10:59:09  ERROR [gway 8552] gateway_forward_request(531) There
isn't enough copies(1) to send out (3)
Feb 27 10:59:09  ERROR [gway 8542] gateway_forward_request(531) There
isn't enough copies(1) to send out (3)
Feb 27 10:59:09  ERROR [gway 8516] gateway_forward_request(531) There
isn't enough copies(1) to send out (2)
Feb 27 10:59:09  ERROR [gway 8519] gateway_forward_request(531) There
isn't enough copies(1) to send out (2)
Feb 27 10:59:09  ERROR [gway 8560] gateway_forward_request(531) There
isn't enough copies(1) to send out (2)
...
...
...


