[sheepdog-users] questions regarding node failure scenario
Corin Langosch
info at corinlangosch.com
Wed Jan 14 11:06:40 CET 2015
Hi again,
I set up a test cluster of 6 machines (sheepdog 0.9.1 compiled from sources, using zookeeper) and simulated a simple node failure. For that I put a single VDI into the cluster, disabled recovery (to see how sheepdog behaves while nodes/disks are down), started qemu and then killed one sheep process with "kill PID" (the one which was also acting as the gateway for the local qemu process).
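In case it helps to reproduce, the commands were roughly the following sketch (the VDI name "test", its size and the exact qemu drive syntax below are just placeholders from memory):

  # create a single VDI and turn off automatic recovery
  dog vdi create test 10G
  dog cluster recover disable

  # start a guest that uses the local sheep (the one on 10.0.0.5) as gateway
  qemu-system-x86_64 -drive file=sheepdog:127.0.0.1:7000:test,if=virtio ...

  # then kill that local sheep process
  kill <PID of the sheep on 10.0.0.5>

Here's what happened: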
- qemu died
- cluster info
Cluster status: running, auto-recovery disabled
Cluster created at Tue Jan 6 00:04:29 2015
Epoch Time Version
2015-01-06 22:54:53 3 [10.0.0.4:7000, 10.0.0.5:7000, 10.0.0.6:7000, 10.0.0.7:7000, 10.0.0.8:7000, 10.0.0.8:7001]
2015-01-06 22:54:20 2 [10.0.0.4:7000, 10.0.0.6:7000, 10.0.0.7:7000, 10.0.0.8:7000, 10.0.0.8:7001]
2015-01-06 00:04:29 1 [10.0.0.4:7000, 10.0.0.5:7000, 10.0.0.6:7000, 10.0.0.7:7000, 10.0.0.8:7000, 10.0.0.8:7001]
- logs on the node where sheep was killed (10.0.0.5) show this
Jan 06 22:17:44 INFO [main] vdi_lock(450) VDI d63538 is locked
Jan 06 22:53:49 INFO [main] rx_main(830) req=0x7fabe0000e80, fd=338, client=127.0.0.1:49990, op=DISABLE_RECOVER, data=(null)
Jan 06 22:53:49 INFO [main] tx_main(882) req=0x7fabe0000e80, fd=338, client=127.0.0.1:49990, op=DISABLE_RECOVER, result=00
Jan 06 22:54:20 INFO [main] main(966) shutdown
Jan 06 22:54:20 INFO [main] zk_leave(989) leaving from cluster
Jan 06 22:54:52 INFO [main] md_add_disk(343) /xfs-drive1/sheepdog/obj, vdisk nr 359, total disk 1
Jan 06 22:54:52 INFO [main] send_join_request(1032) IPv4 ip:10.0.0.5 port:7000 going to join the cluster
Jan 06 22:54:52 NOTICE [main] nfs_init(608) nfs server service is not compiled
Jan 06 22:54:52 INFO [main] check_host_env(500) Allowed open files 65535, suggested 6144000
Jan 06 22:54:52 INFO [main] main(958) sheepdog daemon (version 0.9.1) started
Jan 06 22:54:53 ERROR [block] sheep_exec_req(1170) failed The buffer is too small, remote address: 10.0.0.4:7000, op name: VDI_STATE_SNAPSHOT_CTL
Jan 06 22:54:53 ERROR [block] sheep_exec_req(1170) failed The buffer is too small, remote address: 10.0.0.4:7000, op name: VDI_STATE_SNAPSHOT_CTL
Jan 06 22:54:53 ERROR [block] sheep_exec_req(1170) failed The buffer is too small, remote address: 10.0.0.4:7000, op name: VDI_STATE_SNAPSHOT_CTL
Jan 06 22:54:53 ERROR [block] sheep_exec_req(1170) failed The buffer is too small, remote address: 10.0.0.4:7000, op name: VDI_STATE_SNAPSHOT_CTL
Jan 06 22:56:13 INFO [main] cluster_lock_vdi_main(1349) node: IPv4 ip:10.0.0.5 port:7000 is locking VDI (type: normal): d63538
Jan 06 22:56:13 INFO [main] vdi_lock(450) VDI d63538 is locked
Jan 06 22:56:13 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.4:7000, op name: READ_PEER
Jan 06 22:56:13 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.6:7000, op name: READ_PEER
Jan 06 22:56:13 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.7:7000, op name: READ_PEER
Jan 06 22:56:13 ERROR [rw 22347] sheep_exec_req(1170) failed No object found, remote address: 10.0.0.8:7000, op name: READ_PEER
Jan 06 22:56:13 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.8:7001, op name: READ_PEER
Jan 06 22:56:16 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.4:7000, op name: READ_PEER
Jan 06 22:56:16 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.6:7000, op name: READ_PEER
Jan 06 22:56:16 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.7:7000, op name: READ_PEER
Jan 06 22:56:16 ERROR [rw 22347] sheep_exec_req(1170) failed No object found, remote address: 10.0.0.8:7000, op name: READ_PEER
Jan 06 22:56:16 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.8:7001, op name: READ_PEER
Jan 06 22:56:16 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.4:7000, op name: READ_PEER
Jan 06 22:56:16 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.6:7000, op name: READ_PEER
Jan 06 22:56:16 ERROR [rw 22347] sheep_exec_req(1170) failed Network error between sheep, remote address: 10.0.0.7:7000, op name: READ_PEER
...
- logs on 10.0.0.4 show this
Jan 06 22:17:44 INFO [main] vdi_lock(450) VDI d63538 is locked
Jan 06 22:54:53 INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 1
Jan 06 22:54:53 INFO [main] get_vdi_state_snapshot(1783) maximum allowed length: 512, required length: 4260
Jan 06 22:54:53 ERROR [main] local_vdi_state_snapshot_ctl(1397) failed to get vdi state snapshot: The buffer is too small
Jan 06 22:54:53 INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 1
Jan 06 22:54:53 INFO [main] get_vdi_state_snapshot(1783) maximum allowed length: 1024, required length: 4260
Jan 06 22:54:53 ERROR [main] local_vdi_state_snapshot_ctl(1397) failed to get vdi state snapshot: The buffer is too small
Jan 06 22:54:53 INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 1
Jan 06 22:54:53 INFO [main] get_vdi_state_snapshot(1783) maximum allowed length: 2048, required length: 4260
Jan 06 22:54:53 ERROR [main] local_vdi_state_snapshot_ctl(1397) failed to get vdi state snapshot: The buffer is too small
Jan 06 22:54:53 INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 1
Jan 06 22:54:53 INFO [main] get_vdi_state_snapshot(1783) maximum allowed length: 4096, required length: 4260
Jan 06 22:54:53 ERROR [main] local_vdi_state_snapshot_ctl(1397) failed to get vdi state snapshot: The buffer is too small
Jan 06 22:54:53 INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 1
Jan 06 22:54:53 INFO [main] local_vdi_state_snapshot_ctl(1388) freeing vdi state snapshot at epoch 1
Jan 06 22:56:13 INFO [main] cluster_lock_vdi_main(1349) node: IPv4 ip:10.0.0.5 port:7000 is locking VDI (type: normal): d63538
Jan 06 22:56:13 INFO [main] vdi_lock(454) VDI d63538 is already locked
Jan 06 22:56:13 ERROR [main] cluster_lock_vdi_main(1352) locking d63538failed
Jan 06 22:56:13 ERROR [io 30123] err_to_sderr(101) /xfs-drive1/sheepdog/obj/.stale corrupted
...
- logs on all other nodes show this
Jan 06 22:17:44 INFO [main] vdi_lock(450) VDI d63538 is locked
Jan 06 22:54:53 INFO [main] local_vdi_state_snapshot_ctl(1388) freeing vdi state snapshot at epoch 1
Jan 06 22:56:13 INFO [main] cluster_lock_vdi_main(1349) node: IPv4 ip:10.0.0.5 port:7000 is locking VDI (type: normal): d63538
Jan 06 22:56:13 INFO [main] vdi_lock(454) VDI d63538 is already locked
Jan 06 22:56:13 ERROR [main] cluster_lock_vdi_main(1352) locking d63538failed
Jan 06 22:56:13 ERROR [io 11612] err_to_sderr(101) /xfs-drive1/sheepdog/obj/.stale corrupted
...
Why do all nodes (except the one where I killed sheep) show "/xfs-drive1/sheepdog/obj/.stale corrupted" errors in their logs?
And why does the failed node print "The buffer is too small" messages after its restart? From the log on 10.0.0.4 it looks like the buffer is simply doubled (512, 1024, 2048, 4096) until it can hold the required 4260 bytes, so are these errors harmless?
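In case it is useful, I can run something like the following on one of the complaining nodes and post the output (the VDI name is again a placeholder):

  # look at the stale object directory the error refers to
  ls -la /xfs-drive1/sheepdog/obj/.stale

  # re-enable recovery and, once it has finished, check the VDI
  dog cluster recover enable
  dog vdi check test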
Thank you in advance!
Corin