[sheepdog] [PATCH v2 0/3] sheep: writeback cache semantics in backend store

Hitoshi Mitake mitake.hitoshi at lab.ntt.co.jp
Mon Sep 3 09:08:59 CEST 2012


(2012/09/03 15:51), Hitoshi Mitake wrote:
>
>>
>> I got a core dump with the following one-liner:
>>
>> $ while (($?==0));do sudo ./check -corosync 26;done
>>
>> (gdb) bt
>> #0  0x00007fa3fc81dba5 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
>> #1  0x00007fa3fc8216b0 in abort () at abort.c:92
>> #2  0x00007fa3fc85765b in __libc_message (do_abort=<value optimized out>, fmt=<value optimized out>) at ../sysdeps/unix/sysv/linux/libc_fatal.c:189
>> #3  0x00007fa3fc8616d6 in malloc_printerr (action=3, str=0x7fa3fc935758 "double free or corruption (!prev)", ptr=<value optimized out>) at malloc.c:6283
>> #4  0x00007fa3fc867ea3 in __libc_free (mem=<value optimized out>) at malloc.c:3738
>> #5  0x0000000000407c88 in put_request (req=0x21b1c00) at request.c:513
>> #6  0x000000000040d083 in bs_thread_request_done (fd=<value optimized out>, events=<value optimized out>, data=<value optimized out>) at work.c:137
>> #7  0x00000000004199c6 in event_loop (timeout=<value optimized out>) at event.c:179
>> #8  0x000000000040437d in main (argc=<value optimized out>, argv=0x7fffad763628) at sheep.c:453
>> (gdb) info threads
>>    4 Thread 8362  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
>>    3 Thread 8361  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
>>    2 Thread 8363  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
>> * 1 Thread 8333  0x00007fa3fc81dba5 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
>>
>>
>> When I run without your patch set, I don't hit this segfault.
>>
>
> Thanks for the information. I could also reproduce the test failure
> with 026. It seems to be a timing bug, and I'm trying to debug it.
>

BTW, I have two questions.

How many iterations did it take to reproduce this segfault?
It seems hard to reproduce; it happens only rarely in my
environment.
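
To count iterations, a small wrapper around the one-liner quoted above
might help. This is only a sketch: count_until_fail is a hypothetical
helper, not part of the sheepdog test suite, and it assumes bash.

```sh
#!/bin/bash
# Hypothetical helper: run the given command repeatedly until it
# exits non-zero, then report which iteration failed.
count_until_fail() {
    local n=0
    while "$@"; do
        n=$((n + 1))    # n counts the successful runs so far
    done
    # The failing run is the one after the last successful run.
    echo "failed on iteration $((n + 1))"
}

# Example invocation, assuming the same working directory as the
# one-liner quoted above:
# count_until_fail sudo ./check -corosync 26
```

The reported number is the iteration on which the command first failed,
so a result of 1 means it failed on the very first run.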

Did you modify the test scripts?
If you didn't, it means that this segfault can happen when
sys->store_writeback == 0. That would be helpful information for debugging.

Thanks,
Hitoshi




