[Stgt-devel] iSER
FUJITA Tomonori
fujita.tomonori
Thu Oct 11 07:57:42 CEST 2007
On Mon, 08 Oct 2007 17:36:16 +0200
Erez Zilber <erezz at Voltaire.COM> wrote:
> FUJITA Tomonori wrote:
> > On Thu, 4 Oct 2007 13:20:35 -0400
> > Pete Wyckoff <pw at osc.edu> wrote:
> >
> >
> >> pw at osc.edu wrote on Sun, 09 Sep 2007 14:12 -0400:
> >>
> >>> robin.humble+stgt at anu.edu.au wrote on Sun, 09 Sep 2007 11:30 -0400:
> >>>
> >>>> Summary:
> >>>> - 2.6.21 seems to be a good kernel. 2.6.22 or newer, or RedHat's OFED 1.2
> >>>> patched kernels all seem to have iSER bugs that make them unusable.
> >>>> - as everything works in 2.6.21 presumably this means there's nothing
> >>>> wrong with the iSER implementation in tgtd. well done! :)
> >>>>
> >>> Well, that's good and bad news. Nice to know that things do work at times,
> >>> but we have to figure out what happened in the initiator now. Or maybe tgt
> >>> is making some bad assumptions.
> >>>
> >> This all turned out to be a known bug in the mthca IB driver in
> >> kernels older than 2.6.21. Including the rhel5 kernel. The
> >> initiator uses FMR for memory registrations, and a certain popular
> >> chipset was prone to random scribbling on old registrations,
> >> yielding wrong data in the application or unexplainable kernel
> >> crashes. Nothing wrong in the target.
> >>
> >>
> >>>> with the 2.6.22.6 kernel and iSER I couldn't find any corruption
> >>>> issues using dd to /dev/sdc. however (as reported previously) if I put
> >>>> an ext3 filesystem on the iSER device and then dd to a file in the ext3
> >>>> filesystem then pretty much immediately I get:
> >>>> Sep 9 21:46:22 x11 kernel: EXT3-fs error (device sdc): ext3_new_block: Allocating block in system zone - blocks from 196611, length 1
> >>>> Sep 9 21:46:22 x11 kernel: EXT3-fs error (device sdc): ext3_new_block: Allocating block in system zone - blocks from 196612, length 1
> >>>> Sep 9 21:46:22 x11 kernel: EXT3-fs error (device sdc): ext3_new_block: Allocating block in system zone - blocks from 196613, length 1
> >>>> ...
> >>>>
> >>>> I get the same type of errors with 2.6.23-rc5 too.
> >>>>
> >>> I've still not been able to reproduce this, at least on my
> >>> 2.6.22-rc5. One of these days we'll move to some newer kernels
> >>> here, but have been sort of waiting for the bidi approaches to
> >>> stabilize somewhat.
> >>>
> >> Maybe this is fixed. I did find one possible case where the Send
> >> result may have gone out before the final RDMA write, in the case
> >> when the target is starved for RDMA slots. But I never saw the
> >> problem myself, so can't say for sure.
> >>
> >> In fact, I hacked up the bs-sync code to calculate the result
> >> expected by the test application lmdd, rather than read it off disk,
> >> and could achieve your high throughputs but never any corruptions.
> >> It ran all night last night.
> >>
> >> Anyway, there's a new git out there with this one new patch and some
> >> kernel initiator warnings in the README.iser doc.
> >>
> >
> > Sounds promising. Voltaire guys, any chance to try Pete's latest tree?
>
> We ran some tests on it and it looks ok now (still trying to make it
> crash :-) ). We will run more nasty tests soon, and if anything goes
> wrong, we will report. We will also try to get some performance numbers
> (BW, iops) from our storage.
Cool.
Pete, is the iSER patchset ready for re-submission?
BTW, can you elaborate on the following commit?
http://git.osc.edu/?p=tgt.git;a=commit;h=8d9eae7acd041fc10a7cfe560c1c280dcc290fa1
What type of commands hit this bug?
Thanks,