[sheepdog] sheep exited unexpectedly when disk is out of space
Hongyi Wang
hongyi at zelin.io
Thu May 23 15:09:08 CEST 2013
Hi,
I did some I/O stress testing for sheepdog.
On each node, I started sheep with this command:
    sheep -b 0.0.0.0 -y 10.0.0.XX -p 7000 -j dir=/sheep/journal size=3000 -D \
        -w size=40000 dir=/sheep/object_cache \
        -c zookeeper:10.0.0.10:2181,timeout=30s \
        /sheep/state /sheep/disk1,/sheep/disk2 -P /sheep/state/sheep.pid
Note: I used both /sheep/disk1 and /sheep/disk2.
Before testing, the node status looked like this:
> collie node info
Id Size Used Use%
0 0.0 MB 0.0 MB 0%
1 57 GB 14 GB 24%
2 136 GB 30 GB 22%
3 57 GB 16 GB 27%
Total 250 GB 60 GB 23%
I started a VM on node1 and performed a sequential write on it (~30 GB). The
sheep on node1 exited unexpectedly when there was no space left on the device.
I then checked the disk status on node1:
# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg1-sys 99G 28G 67G 30% /
tmpfs 16G 0 16G 0% /dev/shm
/dev/sda1 985M 45M 890M 5% /boot
/dev/mapper/vg1-home 2.0G 68M 1.9G 4% /home
/dev/mapper/vg1-sheep_disk1
40G 40G 0 100% /sheep/disk1
/dev/mapper/vg1-sheep_journal
4.0G 3.0G 793M 80% /sheep/journal
/dev/mapper/vg1-sheep_obj_cache
40G 420M 37G 2% /sheep/object_cache
/dev/mapper/vg1-var 32G 286M 30G 1% /var
As shown, /sheep/disk1 is mounted on its own dedicated partition, while
/sheep/disk2 has no partition of its own and so resides on "/".
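One way to confirm which directories share a filesystem is to compare their
device IDs. The following is a minimal Python sketch and is not part of
sheepdog; the function name same_filesystem is hypothetical.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same mounted filesystem."""
    # st_dev identifies the device (filesystem) that contains the path.
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

On node1, same_filesystem("/sheep/disk2", "/") should return True if disk2
really sits on the root filesystem, as the df output suggests.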
To the best of my knowledge, a full disk1 should not raise an error and make
sheep exit (see the log below), since sheep knows the size of disk1 and should
switch to writing to disk2 once disk1 is full. At the very least, it should
not throw an error that causes an unexpected exit.
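The fallback expected here can be sketched in Python. This only illustrates
the expectation; it is not how sheep actually places objects across disks.
The function name pick_store_dir and the injectable usage parameter are
assumptions made for the example.

```python
import shutil


def pick_store_dir(dirs, needed_bytes, usage=shutil.disk_usage):
    """Return the first directory whose filesystem can hold needed_bytes.

    Returns None when every directory is full, so the caller can report
    a disk-full error instead of aborting.
    """
    for d in dirs:
        # disk_usage() reports total, used and free bytes for the
        # filesystem that contains d.
        if usage(d).free >= needed_bytes:
            return d
    return None
```

With the layout above, a call such as
pick_store_dir(["/sheep/disk1", "/sheep/disk2"], 4 * 2**20) would skip the
full disk1 and return /sheep/disk2 as long as "/" has room (the 4 MiB figure
is only illustrative).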
Why does this error occur?
Thanks,
--Hongyi
================================================================
sheep.log on the node where sheep exited unexpectedly:
May 24 04:14:44 [io 17463] prealloc(284) failed to preallocate space, No
space left on device
May 24 04:14:44 [io 17463] err_to_sderr(78) diskfull, oid=65e2f7000022a4
May 24 04:14:44 [io 17448] prealloc(284) failed to preallocate space, No
space left on device
May 24 04:14:44 [io 17448] err_to_sderr(78) diskfull, oid=65e2f7000022a6
May 24 04:14:44 [gway 17015] default_create_and_write(342) failed to write
object. No space left on device
May 24 04:14:44 [gway 17015] err_to_sderr(78) diskfull, oid=6ebf780000253b
May 24 04:14:44 [gway 17015] gateway_forward_request(305) fail to write
local 6ebf780000253b, Server has no space for new objects
May 24 04:14:44 [io 17446] default_create_and_write(342) failed to write
object. No space left on device
May 24 04:14:44 [io 17446] err_to_sderr(78) diskfull, oid=65e2f7000022a5
May 24 04:14:45 [io 17472] prealloc(284) failed to preallocate space, No
space left on device
May 24 04:14:45 [io 17472] err_to_sderr(78) diskfull, oid=65e2f7000022ad
May 24 04:14:46 [io 17469] prealloc(284) failed to preallocate space, No
space left on device
May 24 04:14:46 [io 17469] err_to_sderr(78) diskfull, oid=65e2f7000022ae
May 24 04:14:46 [io 17462] default_create_and_write(342) failed to write
object. No space left on device
May 24 04:14:46 [io 17462] err_to_sderr(78) diskfull, oid=65e2f7000022aa
May 24 04:14:46 [gway 17356] default_create_and_write(342) failed to write
object. No space left on device
May 24 04:14:46 [gway 17356] err_to_sderr(78) diskfull, oid=6ebf780000253f
May 24 04:14:46 [gway 17356] gateway_forward_request(305) fail to write
local 6ebf780000253f, Server has no space for new objects
May 24 04:14:46 [gway 17011] default_create_and_write(342) failed to write
object. No space left on device
May 24 04:14:46 [gway 17011] err_to_sderr(78) diskfull, oid=6ebf7800002540
May 24 04:14:46 [gway 17011] gateway_forward_request(305) fail to write
local 6ebf7800002540, Server has no space for new objects
May 24 04:14:46 [oc_push 16832] push_cache_object(467) failed to push
object Server has no space for new objects
May 24 04:14:46 [oc_push 16832] do_push_object(837) PANIC: push failed but
should never fail
May 24 04:14:46 [oc_push 16832] crash_handler(181) sheep exits unexpectedly
(Aborted).
May 24 04:14:46 [oc_push 16832] sd_backtrace(847) sheep() [0x4045b7]
May 24 04:14:46 [oc_push 16534] push_cache_object(467) failed to push
object Server has no space for new objects
May 24 04:14:46 [oc_push 16534] do_push_object(837) PANIC: push failed but
should never fail
May 24 04:14:52 [main] crash_handler(487) sheep pid 8130 exited
unexpectedly.
=====================================================================
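To summarize a log like the one above, the wrapped lines can be rejoined and
grouped by the reporting function. This is a minimal Python sketch that
assumes the line layout seen in this excerpt (timestamp, [thread], function
name with a source line number, then the message); the names
parse_sheep_log and count_by_func are hypothetical.

```python
import re
from collections import Counter

# One record: "May 24 04:14:44 [io 17463] prealloc(284) message..."
RECORD = re.compile(
    r"^(?P<time>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<func>\w+)\((?P<line>\d+)\) ?"
    r"(?P<msg>.*)$"
)


def parse_sheep_log(text):
    """Rejoin mail-wrapped lines and return one dict per log record."""
    joined = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "=====" separator lines.
        if not line or not line.strip("="):
            continue
        if RECORD.match(line):
            joined.append(line)
        elif joined:
            # Continuation of the previous, wrapped record.
            joined[-1] += " " + line
    return [RECORD.match(rec).groupdict() for rec in joined]


def count_by_func(records):
    """Count how many records each function emitted."""
    return Counter(rec["func"] for rec in records)
```

Filtering the parsed records for messages that start with "PANIC" then
points directly at do_push_object, the call that aborted sheep.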