[sheepdog-users] [master branch] SIGABRT when doing: dog vdi check
Marcin Mirosław
marcin at mejor.pl
Wed Jan 8 09:47:51 CET 2014
On 08.01.2014 07:21, Liu Yuan wrote:
> On Tue, Jan 07, 2014 at 03:40:44PM +0100, Marcin Mirosław wrote:
>> On 07.01.2014 14:38, Liu Yuan wrote:
>>> On Tue, Jan 07, 2014 at 01:29:40PM +0100, Marcin Mirosław wrote:
>>>> On 07.01.2014 12:50, Liu Yuan wrote:
>>>>> On Tue, Jan 07, 2014 at 11:14:09AM +0100, Marcin Mirosław wrote:
>>>>>> On 07.01.2014 11:05, Liu Yuan wrote:
>>>>>>> On Tue, Jan 07, 2014 at 10:51:18AM +0100, Marcin Mirosław wrote:
>>>>>>>> On 07.01.2014 03:00, Liu Yuan wrote:
>>>>>>>>> On Mon, Jan 06, 2014 at 05:38:41PM +0100, Marcin Mirosław wrote:
>>>>>>>>>> On 2014-01-06 08:27, Liu Yuan wrote:
>>>>>>>>>>> On Sat, Jan 04, 2014 at 04:13:27PM +0100, Marcin Mirosław wrote:
>>>>>>>>>>>> On 2014-01-04 06:28, Liu Yuan wrote:
>>>>>>>>>>>>> On Fri, Jan 03, 2014 at 10:51:26PM +0100, Marcin Mirosław wrote:
>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>
>>>>>>>>>>>> Hi all!
>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm new on "sheep-run";) I'm starting to try sheepdog so probably
>>>>>>>>>>>>>> I'm doing many things wrongly. I'm playing with sheepdog-0.7.6.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> First problem (SIGABRT): I started multiple sheep daemons on
>>>>>>>>>>>>>> localhost:
>>>>>>>>>>>>>> # for x in 0 1 2 3 4; do sheep -c local -j size=128M -p 700$x /mnt/sheep/metadata/$x,/mnt/sheep/storage/$x; done
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Next:
>>>>>>>>>>>>>> # dog cluster info
>>>>>>>>>>>>>> Cluster status: Waiting for cluster to be formatted
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # dog cluster format -c 2:1
>>>>>>>>>>>>>
>>>>>>>>>>>>> 0.7.6 doesn't support erasure code. Try latest master branch
>>>>>>>>>>>>
>>>>>>>>>>>> Now I'm on 486ace8ccbb [master]. How should I check the chosen redundancy?
>>>>>>>>>>>> # cat /mnt/test/vdi/list
>>>>>>>>>>>> Name     Id  Size    Used    Shared  Creation time     VDI id  Copies  Tag
>>>>>>>>>>>> testowy  0   1.0 GB  0.0 MB  0.0 MB  2014-01-04 15:07  cac836  3
>>>>>>>>>>>>
>>>>>>>>>>>> Here I can see 3 copies, but I can't see how many parity strips
>>>>>>>>>>>> are configured. Is this not implemented yet?
>>>>>>>>>>>
>>>>>>>>>>> Not yet. But currently you can run 'dog cluster info -s' to see the global
>>>>>>>>>>> policy scheme x:y (the one you gave to 'dog cluster format -c x:y').
>>>>>>>>>>>
>>>>>>>>>>> With erasure coding, 'copies' takes on another meaning: the total number of
>>>>>>>>>>> data + parity objects. In your case, it is 2+1=3. But as you said, this is
>>>>>>>>>>> confusing; I am thinking of adding an extra field to indicate the redundancy
>>>>>>>>>>> scheme per vid.
>>>>>>>>>>>
>>>>>>>>>>> Well, as for the above issue, I can't reproduce it. Could you give me more
>>>>>>>>>>> environment information, such as whether your OS is 32 or 64 bit? What is
>>>>>>>>>>> your distro?
>>>>>>>>>>
>>>>>>>>>> Hi!
>>>>>>>>>> I'm using Gentoo 64bits, gcc version 4.7.3 (Gentoo Hardened 4.7.3-r1
>>>>>>>>>> p1.4, pie-0.5.5), kernel 3.10 with Gentoo patches.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Does the problem still exist? I can't reproduce the issue yet. So how did you
>>>>>>>>> reproduce it step by step?
>>>>>>>>
>>>>>>>> Hi!
>>>>>>>> I'm installing sheepdog-0.7.x, next:
>>>>>>>> # mkdir -p /mnt/sheep/{metadata,storage}
>>>>>>>> # for x in 0 1 2 3 4; do sheep -c local -j size=128M -p 700$x /mnt/sheep/metadata/$x,/mnt/sheep/storage/$x; done
>>>>>>>> # dog cluster format -c 2
>>>>>>>> using backend plain store
>>>>>>>> # dog vdi create testowy 5G
>>>>>>>> # dog vdi check testowy
>>>>>>>> PANIC: can't find next new idx
>>>>>>>> dog exits unexpectedly (Aborted).
>>>>>>>> dog() [0x4058da]
>>>>>>>> [...]
>>>>>>>>
>>>>>>>> I'm getting SIGABRT on every try.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> On the same machine, with the master branch (not stable-0.7), you mentioned
>>>>>>> you can't reproduce the problem?
>>>>>>
>>>>>> With the master branch (commit a79e69f9ad9c5) I'm getting this message:
>>>>>> # dog vdi check testowy
>>>>>> PANIC: can't find a valid vnode
>>>>>> dog exits unexpectedly (Aborted).
>>>>>> dog() [0x4057fa]
>>>>>> /lib64/libpthread.so.0(+0xfd8f) [0x7f6d43cd0d8f]
>>>>>> /lib64/libc.so.6(gsignal+0x38) [0x7f6d43951368]
>>>>>> /lib64/libc.so.6(abort+0x147) [0x7f6d439526c7]
>>>>>> dog() [0x40336e]
>>>>>> dog() [0x409d9f]
>>>>>> dog() [0x40cea5]
>>>>>> dog() [0x403927]
>>>>>> /lib64/libc.so.6(__libc_start_main+0xf4) [0x7f6d4393dc04]
>>>>>> dog() [0x403c6c]
>>>>>>
>>>>>> Would a full gdb backtrace be useful?
>>>>>
>>>>> Hmm, before you run 'dog vdi check', what is output of 'dog cluster info',
>>>>> 'dog node list', 'dog node md info --all'?
>>>>
>>>> Output using master branch:
>>>> # dog cluster info
>>>> Cluster status: running, auto-recovery enabled
>>>>
>>>> Cluster created at Tue Jan 7 13:21:53 2014
>>>>
>>>> Epoch Time           Version
>>>> 2014-01-07 13:21:54      1 [127.0.0.1:7000, 127.0.0.1:7001, 127.0.0.1:7002, 127.0.0.1:7003, 127.0.0.1:7004]
>>>>
>>>> # dog node list
>>>> Id Host:Port V-Nodes Zone
>>>> 0 127.0.0.1:7000 128 16777343
>>>> 1 127.0.0.1:7001 128 16777343
>>>> 2 127.0.0.1:7002 128 16777343
>>>> 3 127.0.0.1:7003 128 16777343
>>>> 4 127.0.0.1:7004 128 16777343
>>>>
>>>> # dog node md info --all
>>>> Id Size Used Avail Use% Path
>>>> Node 0:
>>>> 0 4.4 GB 4.0 MB 4.4 GB 0% /mnt/sheep/storage/0
>>>> Node 1:
>>>> 0 4.4 GB 0.0 MB 4.4 GB 0% /mnt/sheep/storage/1
>>>> Node 2:
>>>> 0 4.4 GB 0.0 MB 4.4 GB 0% /mnt/sheep/storage/2
>>>> Node 3:
>>>> 0 4.4 GB 0.0 MB 4.4 GB 0% /mnt/sheep/storage/3
>>>> Node 4:
>>>> 0 4.4 GB 0.0 MB 4.4 GB 0% /mnt/sheep/storage/4
>>>>
>>>
>>> The very strange thing in your output is that only 1 copy was actually
>>> written when you executed 'dog vdi create', even though you formatted the
>>> cluster with two copies specified.
>>>
>>> You can verify this by
>>>
>>> ls /mnt/sheep/storage/*/
>>>
>>> I guess you can only see one object. Dunno why this happened.
>>
>> It is as you said:
>> # ls /mnt/sheep/storage/*/
>> /mnt/sheep/storage/0/:
>> 80cac83600000000
>>
>> /mnt/sheep/storage/1/:
>>
>> /mnt/sheep/storage/2/:
>>
>> /mnt/sheep/storage/3/:
>>
>> /mnt/sheep/storage/4/:
>>
>>
>> Now I'm on commit a79e69f9ad9c and the problem still exists for me (in
>> contrast to 0.7-stable). I noticed that the files "sheepdog_shm" and
>> "lock" appeared in my /tmp. Is that correct?
>>
>
> I suspect there is actually only one node in the cluster, so 'vdi check' panics.
>
> Before you run 'vdi check', try:
>
> for i in `seq 0 4`;do dog cluster info -p 700$i;done
>
> Is the output the same on every node?
>
>
> for i in `seq 0 4`;do dog node list -p 700$i;done
>
> Is that the same too?
Hi!
The output looks as below:
# for i in `seq 0 4`;do dog cluster info -p 700$i;done
Cluster status: running, auto-recovery enabled
Cluster created at Wed Jan 8 09:42:40 2014
Epoch Time           Version
2014-01-08 09:42:41      1 [127.0.0.1:7000, 127.0.0.1:7001, 127.0.0.1:7002, 127.0.0.1:7003, 127.0.0.1:7004]
Cluster status: running, auto-recovery enabled
Cluster created at Wed Jan 8 09:42:40 2014
Epoch Time           Version
2014-01-08 09:42:40      1 [127.0.0.1:7000, 127.0.0.1:7001, 127.0.0.1:7002, 127.0.0.1:7003, 127.0.0.1:7004]
Cluster status: running, auto-recovery enabled
Cluster created at Wed Jan 8 09:42:40 2014
Epoch Time           Version
2014-01-08 09:42:41      1 [127.0.0.1:7000, 127.0.0.1:7001, 127.0.0.1:7002, 127.0.0.1:7003, 127.0.0.1:7004]
Cluster status: running, auto-recovery enabled
Cluster created at Wed Jan 8 09:42:40 2014
Epoch Time           Version
2014-01-08 09:42:40      1 [127.0.0.1:7000, 127.0.0.1:7001, 127.0.0.1:7002, 127.0.0.1:7003, 127.0.0.1:7004]
Cluster status: running, auto-recovery enabled
Cluster created at Wed Jan 8 09:42:40 2014
Epoch Time           Version
2014-01-08 09:42:40      1 [127.0.0.1:7000, 127.0.0.1:7001, 127.0.0.1:7002, 127.0.0.1:7003, 127.0.0.1:7004]
# for i in `seq 0 4`;do dog node list -p 700$i;done
Id Host:Port V-Nodes Zone
0 127.0.0.1:7000 128 16777343
1 127.0.0.1:7001 128 16777343
2 127.0.0.1:7002 128 16777343
3 127.0.0.1:7003 128 16777343
4 127.0.0.1:7004 128 16777343
Id Host:Port V-Nodes Zone
0 127.0.0.1:7000 128 16777343
1 127.0.0.1:7001 128 16777343
2 127.0.0.1:7002 128 16777343
3 127.0.0.1:7003 128 16777343
4 127.0.0.1:7004 128 16777343
Id Host:Port V-Nodes Zone
0 127.0.0.1:7000 128 16777343
1 127.0.0.1:7001 128 16777343
2 127.0.0.1:7002 128 16777343
3 127.0.0.1:7003 128 16777343
4 127.0.0.1:7004 128 16777343
Id Host:Port V-Nodes Zone
0 127.0.0.1:7000 128 16777343
1 127.0.0.1:7001 128 16777343
2 127.0.0.1:7002 128 16777343
3 127.0.0.1:7003 128 16777343
4 127.0.0.1:7004 128 16777343
Id Host:Port V-Nodes Zone
0 127.0.0.1:7000 128 16777343
1 127.0.0.1:7001 128 16777343
2 127.0.0.1:7002 128 16777343
3 127.0.0.1:7003 128 16777343
4 127.0.0.1:7004 128 16777343
Is it possible to put the lock file in another directory? Should the lock
file have a different name for each sheep?
Marcin
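
A side note on the Zone column in the 'dog node list' output: every node
reports the same value, 16777343. Read as a 32-bit integer in little-endian
byte order, that number is 0x0100007F, whose bytes spell 127.0.0.1. The
sketch below only demonstrates that arithmetic. That sheep derives its
default zone from the node's IPv4 address is an assumption here, not
something confirmed in this thread, and decode_zone is a hypothetical
helper name.

```shell
# Decode a zone id as a little-endian IPv4 address.
# Assumption: the Zone value printed by 'dog node list' is the node's
# IPv4 address stored as a 32-bit integer (not confirmed in this thread).
decode_zone() {
    zone=$1
    printf '%d.%d.%d.%d\n' \
        $(( zone & 255 )) \
        $(( (zone >> 8) & 255 )) \
        $(( (zone >> 16) & 255 )) \
        $(( (zone >> 24) & 255 ))
}

decode_zone 16777343   # prints 127.0.0.1
```

Under that reading, all five daemons started on localhost carry one and the
same zone value, which matches the identical Zone column shown for ports
7000 to 7004.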