[sheepdog] sheep exited unexpectedly when disk is out of space

Hongyi Wang hongyi at zelin.io
Thu May 23 15:09:08 CEST 2013


Hi,

I did some I/O stress testing on sheepdog.
On each node, I started sheep with this command:
sheep -b 0.0.0.0 -y 10.0.0.XX -p 7000 -j dir=/sheep/journal size=3000 -D \
    -w size=40000 dir=/sheep/object_cache \
    -c zookeeper:10.0.0.10:2181,timeout=30s \
    /sheep/state /sheep/disk1,/sheep/disk2 -P /sheep/state/sheep.pid

Note that I used both /sheep/disk1 and /sheep/disk2.

Before testing, the node status looked like this:
> collie node info
Id      Size    Used    Use%
 0      0.0 MB  0.0 MB    0%
 1      57 GB   14 GB    24%
 2      136 GB  30 GB    22%
 3      57 GB   16 GB    27%
Total   250 GB  60 GB    23%

I started a VM on node1 and performed a sequential write of about 30 GB in
it. The sheep on node1 exited unexpectedly when the device ran out of space.
I then checked the disk status on node1:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg1-sys    99G   28G   67G  30% /
tmpfs                  16G     0   16G   0% /dev/shm
/dev/sda1             985M   45M  890M   5% /boot
/dev/mapper/vg1-home  2.0G   68M  1.9G   4% /home
/dev/mapper/vg1-sheep_disk1
                       40G   40G     0 100% /sheep/disk1
/dev/mapper/vg1-sheep_journal
                      4.0G  3.0G  793M  80% /sheep/journal
/dev/mapper/vg1-sheep_obj_cache
                       40G  420M   37G   2% /sheep/object_cache
/dev/mapper/vg1-var    32G  286M   30G   1% /var
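
For anyone reproducing this, the same check can be scripted per store directory instead of reading the df output by eye. This is only a minimal sketch in Python: `usage_report` is a hypothetical helper (not part of sheepdog or collie), and the paths in the usage note are simply the ones from my sheep command line.

```python
import shutil

def usage_report(paths):
    """Return {path: (total_bytes, free_bytes, used_percent)} for each path."""
    report = {}
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        # Guard against pseudo-filesystems that report a total size of zero.
        used_pct = 100.0 * usage.used / usage.total if usage.total else 0.0
        report[path] = (usage.total, usage.free, round(used_pct, 1))
    return report
```

Calling `usage_report(["/sheep/disk1", "/sheep/disk2"])` on a node would show whether each directory's filesystem still has free space; a missing path raises `OSError`.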

As shown, disk1 is mounted on its own dedicated partition, while disk2 sits
on "/" by default.
To the best of my knowledge, a full disk1 should not raise an error that
makes sheep exit (see the log below), since sheep knows the size of disk1
and should switch to writing to disk2 once disk1 is full. At the very least,
it should not throw an error and exit unexpectedly.
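
To make the expectation concrete, here is a rough sketch of the fallback behavior I had in mind. It is not how sheepdog's multi-disk code actually works; `pick_store_dir` and its parameters are hypothetical names for illustration, and the 4 MB figure in the usage note is an assumed object size.

```python
import shutil

def pick_store_dir(dirs, needed_bytes, free_bytes=None):
    """Return the first directory with at least needed_bytes free, else None.

    free_bytes is an optional callable (path -> free bytes); by default the
    real filesystem is queried through shutil.disk_usage.
    """
    if free_bytes is None:
        free_bytes = lambda d: shutil.disk_usage(d).free
    for d in dirs:
        try:
            if free_bytes(d) >= needed_bytes:
                return d
        except OSError:
            continue  # unreadable or missing directory: try the next one
    return None  # every directory is full: report "no space", do not abort
```

With disk1 at 0 bytes free and disk2 at 67 GB free, `pick_store_dir(["/sheep/disk1", "/sheep/disk2"], 4 * 2**20)` would return `/sheep/disk2`. When nothing fits, the caller gets `None` and can return an error to the client instead of crashing.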

Why does this error occur?

Thanks,

--Hongyi

================================================================
sheep.log on the node where sheep exited unexpectedly:

May 24 04:14:44 [io 17463] prealloc(284) failed to preallocate space, No space left on device
May 24 04:14:44 [io 17463] err_to_sderr(78) diskfull, oid=65e2f7000022a4
May 24 04:14:44 [io 17448] prealloc(284) failed to preallocate space, No space left on device
May 24 04:14:44 [io 17448] err_to_sderr(78) diskfull, oid=65e2f7000022a6
May 24 04:14:44 [gway 17015] default_create_and_write(342) failed to write object. No space left on device
May 24 04:14:44 [gway 17015] err_to_sderr(78) diskfull, oid=6ebf780000253b
May 24 04:14:44 [gway 17015] gateway_forward_request(305) fail to write local 6ebf780000253b, Server has no space for new objects
May 24 04:14:44 [io 17446] default_create_and_write(342) failed to write object. No space left on device
May 24 04:14:44 [io 17446] err_to_sderr(78) diskfull, oid=65e2f7000022a5
May 24 04:14:45 [io 17472] prealloc(284) failed to preallocate space, No space left on device
May 24 04:14:45 [io 17472] err_to_sderr(78) diskfull, oid=65e2f7000022ad
May 24 04:14:46 [io 17469] prealloc(284) failed to preallocate space, No space left on device
May 24 04:14:46 [io 17469] err_to_sderr(78) diskfull, oid=65e2f7000022ae
May 24 04:14:46 [io 17462] default_create_and_write(342) failed to write object. No space left on device
May 24 04:14:46 [io 17462] err_to_sderr(78) diskfull, oid=65e2f7000022aa
May 24 04:14:46 [gway 17356] default_create_and_write(342) failed to write object. No space left on device
May 24 04:14:46 [gway 17356] err_to_sderr(78) diskfull, oid=6ebf780000253f
May 24 04:14:46 [gway 17356] gateway_forward_request(305) fail to write local 6ebf780000253f, Server has no space for new objects
May 24 04:14:46 [gway 17011] default_create_and_write(342) failed to write object. No space left on device
May 24 04:14:46 [gway 17011] err_to_sderr(78) diskfull, oid=6ebf7800002540
May 24 04:14:46 [gway 17011] gateway_forward_request(305) fail to write local 6ebf7800002540, Server has no space for new objects
May 24 04:14:46 [oc_push 16832] push_cache_object(467) failed to push object Server has no space for new objects
May 24 04:14:46 [oc_push 16832] do_push_object(837) PANIC: push failed but should never fail
May 24 04:14:46 [oc_push 16832] crash_handler(181) sheep exits unexpectedly (Aborted).
May 24 04:14:46 [oc_push 16832] sd_backtrace(847) sheep() [0x4045b7]
May 24 04:14:46 [oc_push 16534] push_cache_object(467) failed to push object Server has no space for new objects
May 24 04:14:46 [oc_push 16534] do_push_object(837) PANIC: push failed but should never fail
May 24 04:14:52 [main] crash_handler(487) sheep pid 8130 exited unexpectedly.
=====================================================================
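
In case it helps with triage, a log like the one above can be summarized with a short script. This is a sketch only: the regular expressions are assumptions based on the line format visible in this excerpt, and `join_wrapped` and `summarize` are hypothetical helper names. Continuation lines produced by mail wrapping are merged back into their entry first.

```python
import re
from collections import Counter

# A new entry starts with a syslog-style timestamp such as "May 24 04:14:44".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} +\d+ \d{2}:\d{2}:\d{2} ")
# Matches e.g. "[io 17463] err_to_sderr(78) diskfull, oid=65e2f7000022a4".
DISKFULL = re.compile(r"\[(\w+) \d+\] err_to_sderr\(\d+\) diskfull, oid=([0-9a-f]+)")

def join_wrapped(lines):
    """Merge mail-wrapped continuation lines back into single log entries."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line  # continuation of the previous entry
    return entries

def summarize(lines):
    """Count diskfull errors per thread type and return the first PANIC entry."""
    counts = Counter()
    first_panic = None
    for entry in join_wrapped(lines):
        match = DISKFULL.search(entry)
        if match:
            counts[match.group(1)] += 1
        if first_panic is None and "PANIC" in entry:
            first_panic = entry
    return counts, first_panic
```

Passing the lines of sheep.log to `summarize` gives the number of diskfull errors per thread type (io, gway) and the first PANIC entry, which in this excerpt comes from an oc_push thread.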