[sheepdog-users] monitor cluster to avoid corruption
Valerio Pachera
sirio81 at gmail.com
Tue Dec 18 15:08:02 CET 2012
2012/12/18 Liu Yuan <namei.unix at gmail.com>:
> I think this is not easy to solve with a small change.
Well, I'm going to add more trouble :-)
During the performance tests in the other thread, my guest crashed 3
or more times.
I have been able to reproduce the problem, so I'll describe it here.
Some days ago I tried writing some data on the guest;
I took note of the percentages shown by 'collie node info';
I deleted the data and wrote some other data (less or the same amount).
The percentage of use didn't change, because once the space has been
allocated, it is reused and no more is allocated.
During today's performance test I did the same thing, but
*when I ran dd the second time (on the same file), the sheep daemon went
crazy on my first node.*
The first node is the one running kvm.
The sheep daemon uses "all" the cpu (180%);
it's not in the cluster anymore;
it's impossible to interact with the guest (kvm uses almost no cpu)
(My cluster size is still the same)
*This is the situation after writing the 512M*
Looking at the percentages, I deduced that I could write another 512M.
collie node list
M  Id  Host:Port          V-Nodes  Zone
-   0  192.168.2.41:7000       16  688040128
-   1  192.168.2.42:7000       16  704817344
-   2  192.168.2.43:7000      161  721594560
collie node info
Id     Size     Used     Use%
 0     982 MB   320 MB   32%
 1     982 MB   228 MB   23%
 2     10.0 GB  540 MB    5%
Total  12 GB    1.1 GB    8%
Total virtual image size 10 GB
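
A minimal sketch of how per-node free space could be read from output like the one above, rather than relying on the Total line. It assumes the column layout shown in this post (Id, Size, Used, Use%) and 1024-based units; the function name `parse_node_info` is hypothetical, and other sheepdog versions may print something different.

```python
import re

# Sample copied from the `collie node info` output in this post.
SAMPLE = """\
Id Size Used Use%
0 982 MB 320 MB 32%
1 982 MB 228 MB 23%
2 10.0 GB 540 MB 5%
Total 12 GB 1.1 GB 8%
"""

# Assumption: sizes are 1024-based; only MB/GB/TB are handled here.
UNIT_MB = {"MB": 1.0, "GB": 1024.0, "TB": 1024.0 * 1024.0}

# One node row: id, size + unit, used + unit, percentage.
# The header and the "Total" row do not start with a digit, so they are skipped.
ROW = re.compile(
    r"^\s*(\d+)\s+([\d.]+)\s+(MB|GB|TB)\s+([\d.]+)\s+(MB|GB|TB)\s+\d+%"
)


def parse_node_info(text):
    """Return a list of (node_id, size_mb, used_mb), one per node row."""
    nodes = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            size_mb = float(m.group(2)) * UNIT_MB[m.group(3)]
            used_mb = float(m.group(4)) * UNIT_MB[m.group(5)]
            nodes.append((int(m.group(1)), size_mb, used_mb))
    return nodes


nodes = parse_node_info(SAMPLE)
for node_id, size_mb, used_mb in nodes:
    print(f"node {node_id}: {size_mb - used_mb:.0f} MB free")

# The node with the highest used/size ratio is the one that fills up first.
fullest = max(nodes, key=lambda n: n[2] / n[1])
print(f"fullest node: {fullest[0]}")
```

With these numbers the Total line reports 8% used, while the two small nodes are already at 32% and 23%, so the cluster-wide figure says little about how much room the smallest node has left.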
*This is the situation after rewriting the same data*
collie node list
M  Id  Host:Port          V-Nodes  Zone
-   0  192.168.2.42:7000       11  704817344
-   1  192.168.2.43:7000      117  721594560
collie node info
Id     Size     Used     Use%
 0     982 MB   660 MB   67%   (Note: this was 55% right after the freeze)
 1     10.0 GB  660 MB    6%
Total  11 GB    1.3 GB   11%
Total virtual image size 10 GB
sheep.log
Dec 18 14:32:21 [block] do_lookup_vdi(393) looking for test (7c2b25)
Dec 18 14:34:56 [gway 805] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:56 [gway 806] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:56 [main] gateway_op_done(100) leaving sheepdog cluster
Dec 18 14:34:56 [gway 807] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:56 [gway 808] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:56 [gway 809] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:56 [main] gateway_op_done(100) leaving sheepdog cluster
Dec 18 14:34:56 [gway 811] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:56 [rw 810] get_vdi_copy_number(82) No VDI copy entry for 0 found
Dec 18 14:34:56 [rw 810] screen_object_list(545) ERROR: can not find copy number for object fc310
Dec 18 14:34:56 [rw 810] get_vdi_copy_number(82) No VDI copy entry for 0 found
Dec 18 14:34:56 [rw 810] screen_object_list(545) ERROR: can not find copy number for object 57
Dec 18 14:34:56 [gway 814] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:56 [gway 815] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:57 [main] queue_cluster_request(315) COMPLETE_RECOVERY (0xa921b0)
Dec 18 14:34:57 [gway 817] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:57 [gway 820] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:57 [gway 821] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:57 [gway 822] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:57 [gway 824] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:57 [gway 825] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:57 [gway 826] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:57 [gway 830] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:57 [gway 831] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:57 [gway 832] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:35:01 [gway 836] prealloc(303) failed to preallocate space, No space left on device
(END)
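
As a rough sketch of how a log like this could be scanned for the failure, the snippet below counts the three messages seen above. The message strings are taken from this sheep.log; the sample joins each message onto one line on the assumption that the wrapping here comes from the mail client, and the names `PATTERNS` and `summarize` are hypothetical.

```python
import re
from collections import Counter

# A few lines from the sheep.log in this post, one message per line.
SAMPLE_LOG = """\
Dec 18 14:34:56 [gway 805] prealloc(303) failed to preallocate space, No space left on device
Dec 18 14:34:56 [main] gateway_op_done(100) leaving sheepdog cluster
Dec 18 14:34:56 [rw 810] screen_object_list(545) ERROR: can not find copy number for object fc310
Dec 18 14:34:57 [main] queue_cluster_request(315) COMPLETE_RECOVERY (0xa921b0)
Dec 18 14:34:57 [gway 817] prealloc(303) failed to preallocate space, No space left on device
"""

# Substrings observed in this log; other sheepdog versions may word them differently.
PATTERNS = {
    "no_space": re.compile(r"failed to preallocate space"),
    "left_cluster": re.compile(r"leaving sheepdog cluster"),
    "missing_copy_number": re.compile(r"can not find copy number"),
}


def summarize(log_text):
    """Count how many log lines match each known failure pattern."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


counts = summarize(SAMPLE_LOG)
for name in PATTERNS:
    print(f"{name}: {counts[name]}")
```

A non-zero "left_cluster" count alongside "no_space" corresponds to the sequence seen here: preallocation fails on the full node and the daemon leaves the cluster.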
I have to run
kill -9 <pid of kvm>
The disk seems to be corrupted:
collie vdi check test
Failed to read, No object found
sheep.log
Dec 18 14:34:58 [main] queue_cluster_request(315) COMPLETE_RECOVERY (0x2029df0)
Dec 18 14:45:41 [main] queue_cluster_request(315) LOCK_VDI (0x7f12e00008e0)
Dec 18 14:45:41 [block] do_lookup_vdi(393) looking for test (7c2b25)
Dec 18 15:01:46 [main] queue_cluster_request(315) LOCK_VDI (0x7f12e0000a00)
Dec 18 15:01:46 [block] do_lookup_vdi(393) looking for test (7c2b25)
0.5.5_6_gb3f888b