Hi,

I did some I/O stress testing for sheepdog. On each node, I started sheep with this command:

    sheep -b 0.0.0.0 -y 10.0.0.XX -p 7000 -j dir=/sheep/journal size=3000 -D -w size=40000 dir=/sheep/object_cache -c zookeeper:10.0.0.10:2181,timeout=30s /sheep/state /sheep/disk1,/sheep/disk2 -P /sheep/state/sheep.pid

Note: I used both /sheep/disk1 and /sheep/disk2.

Before testing, the node status looked like this:

    > collie node info
    Id      Size     Used     Use%
     0      0.0 MB   0.0 MB     0%
     1      57 GB    14 GB     24%
     2      136 GB   30 GB     22%
     3      57 GB    16 GB     27%
    Total   250 GB   60 GB     23%

I started a VM on node1 and performed a sequential write on it (~30 GB). The sheep on node1 exited unexpectedly when there was no space left on the device. I then checked the disk status on node1:

    # df -h
    Filesystem                        Size  Used Avail Use% Mounted on
    /dev/mapper/vg1-sys                99G   28G   67G  30% /
    tmpfs                              16G     0   16G   0% /dev/shm
    /dev/sda1                         985M   45M  890M   5% /boot
    /dev/mapper/vg1-home              2.0G   68M  1.9G   4% /home
    /dev/mapper/vg1-sheep_disk1        40G   40G     0 100% /sheep/disk1
    /dev/mapper/vg1-sheep_journal     4.0G  3.0G  793M  80% /sheep/journal
    /dev/mapper/vg1-sheep_obj_cache    40G  420M   37G   2% /sheep/object_cache
    /dev/mapper/vg1-var                32G  286M   30G   1% /var

As shown, disk1 is mounted on its own dedicated partition, while disk2 falls under "/" by default.

To the best of my knowledge, a full disk1 should not raise an error and make sheep exit (see the attached log), since sheep is aware of the size of disk1 and should switch to writing to disk2 once disk1 is full. At the very least, it should not throw an error and cause an unexpected exit.

Why does this error occur?

Thanks,
--Hongyi

================================================================
sheep.log on the node where sheep exited unexpectedly:

May 24 04:14:44 [io 17463] prealloc(284) failed to preallocate space, No space left on device
May 24 04:14:44 [io 17463] err_to_sderr(78) diskfull, oid=65e2f7000022a4
May 24 04:14:44 [io 17448] prealloc(284) failed to preallocate space, No space left on device
May 24 04:14:44 [io 17448] err_to_sderr(78) diskfull, oid=65e2f7000022a6
May 24 04:14:44 [gway 17015] default_create_and_write(342) failed to write object. No space left on device
May 24 04:14:44 [gway 17015] err_to_sderr(78) diskfull, oid=6ebf780000253b
May 24 04:14:44 [gway 17015] gateway_forward_request(305) fail to write local 6ebf780000253b, Server has no space for new objects
May 24 04:14:44 [io 17446] default_create_and_write(342) failed to write object. No space left on device
May 24 04:14:44 [io 17446] err_to_sderr(78) diskfull, oid=65e2f7000022a5
May 24 04:14:45 [io 17472] prealloc(284) failed to preallocate space, No space left on device
May 24 04:14:45 [io 17472] err_to_sderr(78) diskfull, oid=65e2f7000022ad
May 24 04:14:46 [io 17469] prealloc(284) failed to preallocate space, No space left on device
May 24 04:14:46 [io 17469] err_to_sderr(78) diskfull, oid=65e2f7000022ae
May 24 04:14:46 [io 17462] default_create_and_write(342) failed to write object. No space left on device
May 24 04:14:46 [io 17462] err_to_sderr(78) diskfull, oid=65e2f7000022aa
May 24 04:14:46 [gway 17356] default_create_and_write(342) failed to write object. No space left on device
May 24 04:14:46 [gway 17356] err_to_sderr(78) diskfull, oid=6ebf780000253f
May 24 04:14:46 [gway 17356] gateway_forward_request(305) fail to write local 6ebf780000253f, Server has no space for new objects
May 24 04:14:46 [gway 17011] default_create_and_write(342) failed to write object. No space left on device
May 24 04:14:46 [gway 17011] err_to_sderr(78) diskfull, oid=6ebf7800002540
May 24 04:14:46 [gway 17011] gateway_forward_request(305) fail to write local 6ebf7800002540, Server has no space for new objects
May 24 04:14:46 [oc_push 16832] push_cache_object(467) failed to push object Server has no space for new objects
May 24 04:14:46 [oc_push 16832] do_push_object(837) PANIC: push failed but should never fail
May 24 04:14:46 [oc_push 16832] crash_handler(181) sheep exits unexpectedly (Aborted).
May 24 04:14:46 [oc_push 16832] sd_backtrace(847) sheep() [0x4045b7]
May 24 04:14:46 [oc_push 16534] push_cache_object(467) failed to push object Server has no space for new objects
May 24 04:14:46 [oc_push 16534] do_push_object(837) PANIC: push failed but should never fail
May 24 04:14:52 [main] crash_handler(487) sheep pid 8130 exited unexpectedly.
=====================================================================
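
In case it helps anyone reading a log like this, here is a minimal Python sketch that counts the diskfull errors per worker type and picks out the first PANIC entry. It is only a sketch: the regular expression is inferred from the line format in this excerpt (it is not a documented sheepdog format), the names `LINE_RE` and `summarize` are my own, and it assumes one log entry per line.

```python
import re
from collections import Counter

# Assumed entry shape, inferred from the excerpt (not an official format):
#   "May 24 04:14:44 [io 17463] prealloc(284) <message>"
# The thread id is optional because "[main]" entries carry none.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"\[(?P<thread>\w+)(?:\s+(?P<tid>\d+))?\]\s+"
    r"(?P<func>\w+)\((?P<line>\d+)\)\s+"
    r"(?P<msg>.*)$"
)


def summarize(log_text):
    """Return (diskfull count per thread type, first PANIC entry or None)."""
    diskfull = Counter()
    first_panic = None
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue  # skip separators and anything that is not a log entry
        if m["func"] == "err_to_sderr" and m["msg"].startswith("diskfull"):
            diskfull[m["thread"]] += 1
        elif "PANIC" in m["msg"] and first_panic is None:
            first_panic = (m["ts"], m["thread"], m["func"], m["msg"])
    return diskfull, first_panic


if __name__ == "__main__":
    import sys

    counts, panic = summarize(sys.stdin.read())
    for thread, n in sorted(counts.items()):
        print(f"{thread}: {n} diskfull error(s)")
    print("first PANIC:", panic)
```

It reads the log from standard input, e.g. `python summarize.py < sheep.log` (the script name is arbitrary). The point is only to make it easy to see which thread hit the PANIC relative to the earlier diskfull errors.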