[sheepdog-users] [ANNOUNCE] sheepdog stable release v0.7.3-rc0

Gerald Richter - ECOS richter at ecos.de
Sun Sep 8 09:47:30 CEST 2013


Hi,

Regarding the segfault I mentioned on Friday: there are two nodes, the cluster was formatted with copies = 3, and I am using corosync. I don't have much more information, because I had to import data over the weekend.

During the data import sheep crashed again. It crashed while I had two qemu-img convert processes running at the same time (with different VDIs, of course) on the same machine. I am still using 2 nodes. The network between the two nodes is only 100 Mbit/s, so I got these poll timeouts from time to time, but that wasn't a problem before. Here is the output of sheep.log (the second sheep, on the machine where no import was happening, is still running).

Regards

Gerald

Sep 07 21:50:56   WARN [gway 130685] wait_forward_request(177) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
Sep 07 21:51:00   WARN [gway 130746] wait_forward_request(177) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
Sep 07 21:51:01   WARN [gway 130685] wait_forward_request(177) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
Sep 07 21:51:01   WARN [gway 130200] wait_forward_request(177) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
Sep 07 21:58:50  EMERG [gway 131160] crash_handler(250) sheep exits unexpectedly (Segmentation fault).
Sep 07 21:58:52  EMERG [gway 131160] sd_backtrace(843) sheep.c:252: crash_handler
Sep 07 21:58:52  EMERG [gway 131160] sd_backtrace(857) /lib/x86_64-linux-gnu/libpthread.so.0(+0xf02f) [0x7fda93b9d02f]
Sep 07 21:58:52  EMERG [gway 131160] sd_backtrace(857) /lib/x86_64-linux-gnu/libc.so.6(+0x83fa3) [0x7fda931fafa3]
Sep 07 21:58:52  EMERG [gway 131160] sd_backtrace(843) bitops.h:46: alloc_bitmap
Sep 07 21:58:52  EMERG [gway 131160] sd_backtrace(857) /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b4f) [0x7fda93b94b4f]
Sep 07 21:58:52  EMERG [gway 131160] sd_backtrace(857) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6c) [0x7fda93251a7c]
Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(790) #6  0x000000000041f4a5 in sd_backtrace () at logger.c:862

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) 862		dump_stack_frames();

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) addrs = {0x41f36d, 0x4058b8, 0x7fda93b9d030, 0x7fda931fafa4, 0x423faf, 0x7fda93b94b50, 0x7fda93251a7d, 0x0 <repeats 1017 times>}

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) i = <optimized out>

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) n = <optimized out>

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) __func__ = "sd_backtrace"

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(790) #7  0x00000000004058b8 in crash_handler (signo=11) at sheep.c:252

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) 252		sd_backtrace();

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) __func__ = "crash_handler"

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(790) #8  <signal handler called>

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) No symbol table info available.

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(790) #9  0x00007fda931fafa4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) No symbol table info available.

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(790) #10 0x0000000000423faf in alloc_bitmap (new_bits=262144, old_bits=<optimized out>, old_bmap=<optimized out>) at ../include/bitops.h:46

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) 46			memset(new_bmap + old_size, 0, new_size - old_size);

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) old_size = <optimized out>

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) new_size = 32768

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) new_bmap = <optimized out>

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(790) #11 worker_routine (arg=0x1b90420) at work.c:264

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) 264			tid_map = alloc_bitmap(tid_map, old_tid_max, tid_max);

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) wi = 0x1b90420

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) work = <optimized out>

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) tid = 131160

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) __func__ = "worker_routine"

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(790) #12 0x00007fda93b94b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) No symbol table info available.

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(790) #13 0x00007fda93251a7d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) No symbol table info available.

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(790) #14 0x0000000000000000 in ?? ()

Sep 07 21:59:01  EMERG [gway 131160] dump_stack_frames(804) No symbol table info available.

Sep 07 21:59:01  EMERG [gway 131160] __sd_dump_variable(731) dump __sys
Sep 07 21:59:01  EMERG [gway 131160] __sd_dump_variable(734) $1 = {cdrv = 0x63a520, cdrv_option = 0x0, this_node = {nid = {addr = '\000' <repeats 12 times>"\260, \to\222", port = 7000, io_addr = '\000' <repeats 15 times>, io_port = 0, pad = "\000\000\000"}, nr_vnodes = 64, zone = 2456750512, space = 4099542974464},
Sep 07 21:59:01  EMERG [gway 131160] __sd_dump_variable(739)  cinfo = {proto_ver = 8 '\b', disable_recovery = 0 '\000', nr_nodes = 2, epoch = 3, ctime = 5920862273014739504, flags = 1, nr_copies = 3 '\003', status = SD_STATUS_OK, __pad = 0, store = "plain\000\000\000\000\000\000\000\000\000\000", nodes = {{nid = {a
Sep 07 21:59:01  EMERG [gway 131160] __sd_dump_variable(739) ddr = '\000' <repeats 12 times>"\260, \to\222", port = 7000, io_addr = '\000' <repeats 15 times>, io_port = 0, pad = "\000\000\000"}, nr_vnodes = 56, zone = 2456750512, space = 4099542974464}, {nid = {addr = '\000' <repeats 12 times>"\260, \tzO", port = 7
Sep 07 21:59:01  EMERG [gway 131160] __sd_dump_variable(739) 000, io_addr = '\000' <repeats 15 times>, io_port = 0, pad = "\000\000\000"}, nr_vnodes = 72, zone = 1333397936, space = 5322119831552}, {nid = {addr = '\000' <repeats 15 times>, port = 0, io_addr = '\000' <repeats 15 times>, io_port = 0, pad = "\000\000\
Sep 07 21:59:01  EMERG [gway 131160] __sd_dump_variable(739) 000"}, nr_vnodes = 0, zone = 0, space = 0} <repeats 1022 times>}}, disk_space = 4099542974464, vdi_inuse = {0 <repeats 11459 times>, 9007199254740992, 0 <repeats 15087 times>, 1099511627776, 0 <repeats 74397 times>, 18014398509481984, 0 <repeats 1792 time
Sep 07 21:59:01  EMERG [gway 131160] __sd_dump_variable(739) s>, 262144, 0 <repeats 24409 times>, 137438953472, 0 <repeats 127514 times>, 131072, 0 <repeats 2506 times>, 281474976710656, 0 <repeats 4973 times>}, local_req_efd = 11, local_req_lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __ki
Sep 07 21:59:01  EMERG [gway 131160] __sd_dump_variable(739) nd = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, local_req_queue = {next = 0x84a848, prev = 0x84a848}, req_wait_queue = {next = 0x84a858, prev = 0x84a858}, nr_outstanding_reqs = 1, gateway_only
Sep 07 21:59:01  EMERG [gway 131160] __sd_dump_variable(739)  = false, nosync = false, gateway_wqueue = 0x1b904f0, io_wqueue = 0x1b90730, deletion_wqueue = 0x1b90bb0, recovery_wqueue = 0x1b90970, recovery_notify_wqueue = 0x0, block_wqueue = 0x1b90df0, oc_reclaim_wqueue = 0x1b91270, oc_push_wqueue = 0x1b8f5d0, md_wq
Sep 07 21:59:01  EMERG [gway 131160] __sd_dump_variable(739) ueue = 0x1b91030, enable_object_cache = true, object_cache_size = 100000, object_cache_directio = false, use_journal = {val = 1}, backend_dio = false, upgrade = false}

Sep 07 21:59:02  ERROR [main] crash_handler(490) sheep pid 109277 exited unexpectedly.
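
Frame #10 above reports new_bits=262144 and new_size = 32768, which is 262144 / 8 bytes, and the statement at bitops.h:46 zeroes the tail of a bitmap that has just grown. Below is a minimal sketch of that kind of grow-and-zero helper. It is not the sheepdog source: the names bitmap_bytes and grow_bitmap are hypothetical, and plain realloc is an assumption. The sketch offsets through a char * so that old_size is counted in bytes; with a pointer to unsigned long the same addition would advance in units of sizeof(unsigned long). Whether that matters for the crash cannot be told from the log alone.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed to hold nbits, rounded up to whole unsigned longs. */
static size_t bitmap_bytes(size_t nbits)
{
	size_t longs = (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG;

	return longs * sizeof(unsigned long);
}

/*
 * Hypothetical helper: grow a bitmap from old_bits to new_bits and
 * zero only the newly added tail. Returns NULL if realloc fails, in
 * which case the old bitmap is left untouched.
 */
static unsigned long *grow_bitmap(unsigned long *old_bmap,
				  size_t old_bits, size_t new_bits)
{
	size_t old_size = bitmap_bytes(old_bits);
	size_t new_size = bitmap_bytes(new_bits);
	unsigned long *new_bmap;

	/* This sketch only grows; shrinking is out of scope. */
	assert(new_bits >= old_bits);

	new_bmap = realloc(old_bmap, new_size);
	if (new_bmap == NULL)
		return NULL;

	/* Cast to char * so the offset is in bytes, not in longs. */
	if (new_size > old_size)
		memset((char *)new_bmap + old_size, 0, new_size - old_size);

	return new_bmap;
}
```

Calling grow_bitmap(NULL, 0, 64) followed by grow_bitmap(map, 64, 262144) keeps the first word intact and leaves every newly added word zeroed.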

> -----Original Message-----
> From: sheepdog-users-bounces at lists.wpkg.org [mailto:sheepdog-users-
> bounces at lists.wpkg.org] On Behalf Of Hitoshi Mitake
> Sent: Friday, 6 September 2013 16:59
> To: sheepdog-users at lists.wpkg.org; sheepdog at lists.wpkg.org
> Subject: [sheepdog-users] [ANNOUNCE] sheepdog stable release v0.7.3-rc0
> 
> Hi sheepdog users and developers,
> 
> I released v0.7.3-rc0 of stable branch. You can download a source archive
> from these URLs:
> tar.gz: https://github.com/sheepdog/sheepdog/archive/v0.7.3-rc0.tar.gz
> zip: https://github.com/sheepdog/sheepdog/archive/v0.7.3-rc0.zip
> 
> The most important updates of this release are:
>  - some bugfixes for vdi deletion process
>  - prevent losing vdi information at cluster initialization sequence
>  - remove possibility of segfault in main event loop
> 
> If no one complains about this release in 2 days, it will officially become v0.7.3.
> 
> Below is the summary of commits this release contains.
> 
> Hitoshi Mitake (5):
>       tests/functional: let check clean directories of passed tests in default
>       tests/functional: unmount loopback devices before cleaning directories
>       sheep: make the vid deletion proceduer correct order
>       sheep: initialize vdi bitmap after completion of reading inode object
>       sheep: set bit in vdi_inuse in atomic manner
> 
> MORITA Kazutaka (3):
>       sheep: don't remove vdi object from object list cache
>       sheep: wait until there is no get_vdis work in wait_get_vdis_done()
>       event: refresh event info after unregistering
> 
> Robin Dong (1):
>       fix error-io in sheepfs when using ext4 filesystem
> 
> 
> Thanks,
> Hitoshi
> --
> sheepdog-users mailing lists
> sheepdog-users at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog-users



