[sheepdog-users] [master branch] SIGABRT when doing: dog vdi check
Liu Yuan
namei.unix at gmail.com
Wed Jan 8 07:21:01 CET 2014
On Tue, Jan 07, 2014 at 03:40:44PM +0100, Marcin Mirosław wrote:
> On 07.01.2014 14:38, Liu Yuan wrote:
> > On Tue, Jan 07, 2014 at 01:29:40PM +0100, Marcin Mirosław wrote:
> >> On 07.01.2014 12:50, Liu Yuan wrote:
> >>> On Tue, Jan 07, 2014 at 11:14:09AM +0100, Marcin Mirosław wrote:
> >>>> On 07.01.2014 11:05, Liu Yuan wrote:
> >>>>> On Tue, Jan 07, 2014 at 10:51:18AM +0100, Marcin Mirosław wrote:
> >>>>>> On 07.01.2014 03:00, Liu Yuan wrote:
> >>>>>>> On Mon, Jan 06, 2014 at 05:38:41PM +0100, Marcin Mirosław wrote:
> >>>>>>>> On 2014-01-06 08:27, Liu Yuan wrote:
> >>>>>>>>> On Sat, Jan 04, 2014 at 04:13:27PM +0100, Marcin Mirosław wrote:
> >>>>>>>>>> On 2014-01-04 06:28, Liu Yuan wrote:
> >>>>>>>>>>> On Fri, Jan 03, 2014 at 10:51:26PM +0100, Marcin Mirosław wrote:
> >>>>>>>>>>>> Hi!
> >>>>>>>>>>
> >>>>>>>>>> Hi all!
> >>>>>>>>>>
> >>>>>>>>>>>> I'm new on "sheep-run";) I'm starting to try sheepdog so probably
> >>>>>>>>>>>> I'm doing many things wrongly. I'm playing with sheepdog-0.7.6.
> >>>>>>>>>>>>
> >>>>>>>>>>>> First problem (SIGABRT): I started multiple sheep daemons on
> >>>>>>>>>>>> localhost: # for x in 0 1 2 3 4; do sheep -c local -j size=128M
> >>>>>>>>>>>> -p 700$x /mnt/sheep/metadata/$x,/mnt/sheep/storage/$x; done
> >>>>>>>>>>>>
> >>>>>>>>>>>> Next: # dog cluster info Cluster status: Waiting for cluster to
> >>>>>>>>>>>> be formatted
> >>>>>>>>>>>>
> >>>>>>>>>>>> # dog cluster format -c 2:1
> >>>>>>>>>>>
> >>>>>>>>>>> 0.7.6 doesn't support erasure code. Try latest master branch
> >>>>>>>>>>
> >>>>>>>>>> Now I'm on 486ace8ccbb [master]. How should I check the chosen redundancy?
> >>>>>>>>>> # cat /mnt/test/vdi/list
> >>>>>>>>>> Name     Id  Size    Used    Shared  Creation time     VDI id  Copies  Tag
> >>>>>>>>>> testowy  0   1.0 GB  0.0 MB  0.0 MB  2014-01-04 15:07  cac836  3
> >>>>>>>>>>
> >>>>>>>>>> Here I can see 3 copies, but I can't see how many parity strips are
> >>>>>>>>>> configured. Perhaps this isn't implemented yet?
> >>>>>>>>>
> >>>>>>>>> Not yet. For now you can run 'dog cluster info -s' to see the global policy
> >>>>>>>>> scheme x:y (the one you passed to 'dog cluster format -c x:y').
> >>>>>>>>>
> >>>>>>>>> With erasure coding, 'copies' takes on another meaning: the total number of
> >>>>>>>>> data + parity objects. In your case, it is 2+1=3. But as you said, this is
> >>>>>>>>> confusing; I am thinking of adding an extra field to indicate the redundancy
> >>>>>>>>> scheme per vid.
> >>>>>>>>>
> >>>>>>>>> As for the issue above, I can't reproduce it. Could you give me more
> >>>>>>>>> information about your environment, such as whether your OS is 32 or 64 bit
> >>>>>>>>> and which distro you use?
> >>>>>>>>
> >>>>>>>> Hi!
> >>>>>>>> I'm using 64-bit Gentoo, gcc version 4.7.3 (Gentoo Hardened 4.7.3-r1
> >>>>>>>> p1.4, pie-0.5.5), kernel 3.10 with Gentoo patches.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Does the problem still exist? I can't reproduce the issue yet. So how did you
> >>>>>>> reproduce it step by step?
> >>>>>>
> >>>>>> Hi!
> >>>>>> I'm installing sheepdog-0.7.x, next:
> >>>>>> # mkdir -p /mnt/sheep/{metadata,storage}
> >>>>>> # for x in 0 1 2 3 4; do sheep -c local -j size=128M -p 700$x
> >>>>>> /mnt/sheep/metadata/$x,/mnt/sheep/storage/$x; done
> >>>>>> # dog cluster format -c 2
> >>>>>> using backend plain store
> >>>>>> # dog vdi create testowy 5G
> >>>>>> # dog vdi check testowy
> >>>>>> PANIC: can't find next new idx
> >>>>>> dog exits unexpectedly (Aborted).
> >>>>>> dog() [0x4058da]
> >>>>>> [...]
> >>>>>>
> >>>>>> I'm getting SIGABRT on every try.
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> On the same machine, with master branch(not stable-0.7), you mentioned you can't
> >>>>> reproduce the problem?
> >>>>
> >>>> With master branch (commit a79e69f9ad9c5) I get this message:
> >>>> # dog vdi check testowy
> >>>> PANIC: can't find a valid vnode
> >>>> dog exits unexpectedly (Aborted).
> >>>> dog() [0x4057fa]
> >>>> /lib64/libpthread.so.0(+0xfd8f) [0x7f6d43cd0d8f]
> >>>> /lib64/libc.so.6(gsignal+0x38) [0x7f6d43951368]
> >>>> /lib64/libc.so.6(abort+0x147) [0x7f6d439526c7]
> >>>> dog() [0x40336e]
> >>>> dog() [0x409d9f]
> >>>> dog() [0x40cea5]
> >>>> dog() [0x403927]
> >>>> /lib64/libc.so.6(__libc_start_main+0xf4) [0x7f6d4393dc04]
> >>>> dog() [0x403c6c]
> >>>>
> >>>> Would a full gdb backtrace be useful?
> >>>
> >>> Hmm, before you run 'dog vdi check', what is the output of 'dog cluster info',
> >>> 'dog node list', and 'dog node md info --all'?
> >>
> >> Output using master branch:
> >> # dog cluster info
> >> Cluster status: running, auto-recovery enabled
> >>
> >> Cluster created at Tue Jan 7 13:21:53 2014
> >>
> >> Epoch Time Version
> >> 2014-01-07 13:21:54 1 [127.0.0.1:7000, 127.0.0.1:7001,
> >> 127.0.0.1:7002, 127.0.0.1:7003, 127.0.0.1:7004]
> >>
> >> # dog node list
> >> Id Host:Port V-Nodes Zone
> >> 0 127.0.0.1:7000 128 16777343
> >> 1 127.0.0.1:7001 128 16777343
> >> 2 127.0.0.1:7002 128 16777343
> >> 3 127.0.0.1:7003 128 16777343
> >> 4 127.0.0.1:7004 128 16777343
> >>
> >> # dog node md info --all
> >> Id Size Used Avail Use% Path
> >> Node 0:
> >> 0 4.4 GB 4.0 MB 4.4 GB 0% /mnt/sheep/storage/0
> >> Node 1:
> >> 0 4.4 GB 0.0 MB 4.4 GB 0% /mnt/sheep/storage/1
> >> Node 2:
> >> 0 4.4 GB 0.0 MB 4.4 GB 0% /mnt/sheep/storage/2
> >> Node 3:
> >> 0 4.4 GB 0.0 MB 4.4 GB 0% /mnt/sheep/storage/3
> >> Node 4:
> >> 0 4.4 GB 0.0 MB 4.4 GB 0% /mnt/sheep/storage/4
> >>
> >
> > The very strange thing in your output is that only 1 copy was actually
> > written when you executed 'dog vdi create', although you formatted the cluster
> > with two copies specified.
> >
> > You can verify this by
> >
> > ls /mnt/sheep/storage/*/
> >
> > I guess you can only see one object. Dunno why this happened.
>
> It is as you said:
> # ls /mnt/sheep/storage/*/
> /mnt/sheep/storage/0/:
> 80cac83600000000
>
> /mnt/sheep/storage/1/:
>
> /mnt/sheep/storage/2/:
>
> /mnt/sheep/storage/3/:
>
> /mnt/sheep/storage/4/:
>
>
> Now I'm on commit a79e69f9ad9c and the problem still exists for me (in
> contrast to 0.7-stable). I noticed that the files "sheepdog_shm" and "lock"
> appeared in my /tmp. Is that expected?
>
I suspect there is actually only one node in the cluster, so 'vdi check' panics.

Before you run 'vdi check', run

for i in `seq 0 4`; do dog cluster info -p 700$i; done

Is the output the same on every node?

for i in `seq 0 4`; do dog node list -p 700$i; done

Is it the same there too?
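
If eyeballing five outputs is tedious, the comparison can be scripted. The following is only a rough sketch: the function name same_on_all_nodes is made up for illustration, and it assumes the five daemons listen on ports 7000 to 7004 as started earlier in this thread. It runs the given dog subcommand against each port and reports any port whose output differs from that of port 7000.

```shell
#!/bin/sh
# Sketch only: same_on_all_nodes is a hypothetical helper, not part of sheepdog.
# Assumes the five sheep daemons from this thread listen on ports 7000-7004.
same_on_all_nodes() {
    # "$@" is the dog subcommand to run on every node, e.g.: node list
    ref=""
    rc=0
    for i in 0 1 2 3 4; do
        port="700$i"
        # Capture stdout and stderr so connection errors also count as a difference.
        out=$(dog "$@" -p "$port" 2>&1)
        if [ "$i" -eq 0 ]; then
            ref=$out        # output of the first node is the reference
        elif [ "$out" != "$ref" ]; then
            echo "port $port differs from port 7000"
            rc=1
        fi
    done
    return $rc
}
```

For example, 'same_on_all_nodes cluster info' followed by 'same_on_all_nodes node list' covers both checks; a non-zero return status means at least one daemon has a different view of the cluster.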
Thanks
Yuan