[sheepdog-users] [master branch] SIGABRT when doing: dog vdi check

Marcin Mirosław marcin at mejor.pl
Tue Jan 7 15:40:44 CET 2014


On 07.01.2014 14:38, Liu Yuan wrote:
> On Tue, Jan 07, 2014 at 01:29:40PM +0100, Marcin Mirosław wrote:
>> On 07.01.2014 12:50, Liu Yuan wrote:
>>> On Tue, Jan 07, 2014 at 11:14:09AM +0100, Marcin Mirosław wrote:
>>>> On 07.01.2014 11:05, Liu Yuan wrote:
>>>>> On Tue, Jan 07, 2014 at 10:51:18AM +0100, Marcin Mirosław wrote:
>>>>>> On 07.01.2014 03:00, Liu Yuan wrote:
>>>>>>> On Mon, Jan 06, 2014 at 05:38:41PM +0100, Marcin Mirosław wrote:
>>>>>>>> On 2014-01-06 08:27, Liu Yuan wrote:
>>>>>>>>> On Sat, Jan 04, 2014 at 04:13:27PM +0100, Marcin Mirosław wrote:
>>>>>>>>>> On 2014-01-04 06:28, Liu Yuan wrote:
>>>>>>>>>>> On Fri, Jan 03, 2014 at 10:51:26PM +0100, Marcin Mirosław wrote:
>>>>>>>>>>>> Hi!
>>>>>>>>>>
>>>>>>>>>> Hi all!
>>>>>>>>>>
>>>>>>>>>>>> I'm new on "sheep-run";) I'm starting to try sheepdog so probably
>>>>>>>>>>>> I'm doing many things wrongly. I'm playing with sheepdog-0.7.6.
>>>>>>>>>>>>
>>>>>>>>>>>> First problem (SIGABRT): I started multiple sheep daemons on
>>>>>>>>>>>> localhost:
>>>>>>>>>>>> # for x in 0 1 2 3 4; do sheep -c local -j size=128M -p 700$x /mnt/sheep/metadata/$x,/mnt/sheep/storage/$x; done
>>>>>>>>>>>>
>>>>>>>>>>>> Next:
>>>>>>>>>>>> # dog cluster info
>>>>>>>>>>>> Cluster status: Waiting for cluster to be formatted
>>>>>>>>>>>>
>>>>>>>>>>>> # dog cluster format -c 2:1
>>>>>>>>>>>
>>>>>>>>>>> 0.7.6 doesn't support erasure coding. Try the latest master branch.
>>>>>>>>>>
>>>>>>>>>> Now I'm on 486ace8ccbb [master]. How should I check the chosen redundancy?
>>>>>>>>>>  # cat /mnt/test/vdi/list
>>>>>>>>>>    Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
>>>>>>>>>>    testowy      0  1.0 GB  0.0 MB  0.0 MB 2014-01-04 15:07   cac836     3
>>>>>>>>>>
>>>>>>>>>> Here I can see 3 copies, but I can't see how many parity strips
>>>>>>>>>> are configured. Probably this isn't implemented yet?
>>>>>>>>>
>>>>>>>>> Not yet. But currently you can run 'dog cluster info -s' to see the global
>>>>>>>>> policy scheme x:y (the one you set with 'dog cluster format -c x:y').
>>>>>>>>>
>>>>>>>>> With erasure coding, 'copies' takes on another meaning: the total number of
>>>>>>>>> data + parity objects. In your case, it is 2+1=3. But as you said, this is
>>>>>>>>> confusing, so I am thinking of adding an extra field to indicate the
>>>>>>>>> redundancy scheme per vid.
>>>>>>>>>
>>>>>>>>> As for the issue above, I can't reproduce it. Could you give me more
>>>>>>>>> environment information, such as whether your OS is 32 or 64 bit, and
>>>>>>>>> what your distro is?
>>>>>>>>
>>>>>>>> Hi!
>>>>>>>> I'm using 64-bit Gentoo, gcc version 4.7.3 (Gentoo Hardened 4.7.3-r1
>>>>>>>> p1.4, pie-0.5.5), kernel 3.10 with Gentoo patches.
>>>>>>>>
>>>>>>>
>>>>>>> Does the problem still exist? I can't reproduce the issue yet. So how did you
>>>>>>> reproduce it step by step?
>>>>>>
>>>>>> Hi!
>>>>>> I install sheepdog-0.7.x, then:
>>>>>> # mkdir -p /mnt/sheep/{metadata,storage}
>>>>>> # for x in 0 1 2 3 4; do sheep -c local -j size=128M -p 700$x /mnt/sheep/metadata/$x,/mnt/sheep/storage/$x; done
>>>>>> # dog cluster format -c 2
>>>>>> using backend plain store
>>>>>> # dog vdi create testowy 5G
>>>>>> # dog  vdi check testowy
>>>>>> PANIC: can't find next new idx
>>>>>> dog exits unexpectedly (Aborted).
>>>>>> dog() [0x4058da]
>>>>>> [...]
>>>>>>
>>>>>> I'm getting SIGABRT on every try.
>>>>>>
>>>>>>
>>>>>
>>>>> On the same machine, with the master branch (not stable-0.7), you mentioned
>>>>> you can't reproduce the problem?
>>>>
>>>> With the master branch (commit a79e69f9ad9c5) I get this message:
>>>> # dog  vdi check testowy
>>>> PANIC: can't find a valid vnode
>>>> dog exits unexpectedly (Aborted).
>>>> dog() [0x4057fa]
>>>> /lib64/libpthread.so.0(+0xfd8f) [0x7f6d43cd0d8f]
>>>> /lib64/libc.so.6(gsignal+0x38) [0x7f6d43951368]
>>>> /lib64/libc.so.6(abort+0x147) [0x7f6d439526c7]
>>>> dog() [0x40336e]
>>>> dog() [0x409d9f]
>>>> dog() [0x40cea5]
>>>> dog() [0x403927]
>>>> /lib64/libc.so.6(__libc_start_main+0xf4) [0x7f6d4393dc04]
>>>> dog() [0x403c6c]
>>>>
>>>> Would a full gdb backtrace be useful?
>>>
>>> Hmm, before you run 'dog vdi check', what is the output of 'dog cluster info',
>>> 'dog node list', and 'dog node md info --all'?
>>
>> Output using master branch:
>> # dog cluster info
>> Cluster status: running, auto-recovery enabled
>>
>> Cluster created at Tue Jan  7 13:21:53 2014
>>
>> Epoch Time           Version
>> 2014-01-07 13:21:54      1 [127.0.0.1:7000, 127.0.0.1:7001, 127.0.0.1:7002, 127.0.0.1:7003, 127.0.0.1:7004]
>>
>> # dog node list
>>   Id   Host:Port         V-Nodes       Zone
>>    0   127.0.0.1:7000           128   16777343
>>    1   127.0.0.1:7001           128   16777343
>>    2   127.0.0.1:7002           128   16777343
>>    3   127.0.0.1:7003           128   16777343
>>    4   127.0.0.1:7004           128   16777343
>>
>> # dog node md info --all
>> Id      Size    Used    Avail   Use%    Path
>> Node 0:
>>  0      4.4 GB  4.0 MB  4.4 GB    0%    /mnt/sheep/storage/0
>> Node 1:
>>  0      4.4 GB  0.0 MB  4.4 GB    0%    /mnt/sheep/storage/1
>> Node 2:
>>  0      4.4 GB  0.0 MB  4.4 GB    0%    /mnt/sheep/storage/2
>> Node 3:
>>  0      4.4 GB  0.0 MB  4.4 GB    0%    /mnt/sheep/storage/3
>> Node 4:
>>  0      4.4 GB  0.0 MB  4.4 GB    0%    /mnt/sheep/storage/4
>>
> 
> The very strange thing in your output is that only 1 copy was actually
> written when you executed 'dog vdi create', although you formatted the
> cluster with two copies specified.
> 
> You can verify this by
> 
> ls /mnt/sheep/storage/*/
> 
> I guess you can only see one object. Dunno why this happened.

It is as you said:
# ls /mnt/sheep/storage/*/
/mnt/sheep/storage/0/:
80cac83600000000

/mnt/sheep/storage/1/:

/mnt/sheep/storage/2/:

/mnt/sheep/storage/3/:

/mnt/sheep/storage/4/:
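
For anyone repeating this check, the per-directory count can be scripted. This is a minimal sketch, assuming the layout used in this thread (one directory per node under a common base path). count_objects is a name made up for illustration, not a sheepdog tool.

```shell
# Print "<directory> <number of regular files>" for each node directory
# found directly under the base path given as the first argument.
count_objects() {
    base=$1
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue
        # Top-level regular files only; subdirectories are not descended into.
        n=$(( $(find "$dir" -maxdepth 1 -type f | wc -l) ))
        printf '%s %d\n' "$dir" "$n"
    done
}
```

Run as: count_objects /mnt/sheep/storage. Note that, unlike plain ls, find also counts hidden files, so the numbers can differ from the listing above if any exist.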


Now I'm on commit a79e69f9ad9c and the problem still exists for me (in
contrast to 0.7-stable). I noticed that the files "sheepdog_shm" and
"lock" appeared in my /tmp. Is that correct?

And I'm attaching the sheep's logs:
> # cat /mnt/sheep/metadata/*/sheep.log
> Jan 07 15:20:12   INFO [main] md_add_disk(311) /mnt/sheep/storage/0, vdisk nr 6, total disk 1
> Jan 07 15:20:12   INFO [main] send_join_request(781) IPv4 ip:127.0.0.1 port:7000
> Jan 07 15:20:12   INFO [main] check_host_env(478) Allowed open files 1000000, suggested 6144000
> Jan 07 15:20:12   INFO [main] main(882) sheepdog daemon (version 0.7.50) started
> Jan 07 15:20:12   INFO [main] md_add_disk(311) /mnt/sheep/storage/1, vdisk nr 6, total disk 1
> Jan 07 15:20:12   INFO [main] send_join_request(781) IPv4 ip:127.0.0.1 port:7001
> Jan 07 15:20:12   INFO [main] check_host_env(478) Allowed open files 1000000, suggested 6144000
> Jan 07 15:20:12   INFO [main] main(882) sheepdog daemon (version 0.7.50) started
> Jan 07 15:20:12   INFO [main] md_add_disk(311) /mnt/sheep/storage/2, vdisk nr 6, total disk 1
> Jan 07 15:20:12   INFO [main] send_join_request(781) IPv4 ip:127.0.0.1 port:7002
> Jan 07 15:20:12   INFO [main] check_host_env(478) Allowed open files 1000000, suggested 6144000
> Jan 07 15:20:12   INFO [main] main(882) sheepdog daemon (version 0.7.50) started
> Jan 07 15:20:12   INFO [main] md_add_disk(311) /mnt/sheep/storage/3, vdisk nr 6, total disk 1
> Jan 07 15:20:12   INFO [main] send_join_request(781) IPv4 ip:127.0.0.1 port:7003
> Jan 07 15:20:12   INFO [main] check_host_env(478) Allowed open files 1000000, suggested 6144000
> Jan 07 15:20:12   INFO [main] main(882) sheepdog daemon (version 0.7.50) started
> Jan 07 15:20:12   INFO [main] md_add_disk(311) /mnt/sheep/storage/4, vdisk nr 6, total disk 1
> Jan 07 15:20:12   INFO [main] send_join_request(781) IPv4 ip:127.0.0.1 port:7004
> Jan 07 15:20:12   INFO [main] check_host_env(478) Allowed open files 1000000, suggested 6144000
> Jan 07 15:20:12   INFO [main] main(882) sheepdog daemon (version 0.7.50) started
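
A side note on the 'dog node list' output quoted earlier: all five nodes report the same Zone value, 16777343. That number is exactly 127.0.0.1 packed into a 32-bit integer with the first octet in the lowest byte. The sketch below only reproduces the arithmetic. That the default zone is derived from the node address this way is my assumption, and whether the shared zone is related to the single copy is not established in this thread. ipv4_to_le_u32 is a made-up helper name.

```shell
# Pack a dotted-quad IPv4 address into an integer, first octet in the
# lowest byte (little-endian order). Illustration only, not sheepdog code.
ipv4_to_le_u32() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# ipv4_to_le_u32 127.0.0.1   prints 16777343
```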


Marcin


