erezz at Voltaire.COM wrote on Tue, 18 Dec 2007 14:11 +0200:
> >>>> We ran some tests on it. Most of them are ok except for fsck. We ran it
> >>>> in the following way:
> >>>>
> >>>> seed5:/tmp/regtest # parted -s /dev/sdb mkpart primary 0 8500
> >>>> seed5:/tmp/regtest # for ((i=1;i<=1000;i++)) do mkfs -t ext2 -q
> >>>> /dev/sdb1; fsck -y -ft ext2 /dev/sdb1; echo iteration $i is done; done
> >>>>
> >>>> fsck is ok most of the time, but once in a while it looks like this
> >>>> (after ~300 iterations):
> >>>>
> >>>> fsck 1.38 (30-Jun-2005)
> >>>> e2fsck 1.38 (30-Jun-2005)
> >>>> Pass 1: Checking inodes, blocks, and sizes
> >>>> Pass 2: Checking directory structure
> >>>> Pass 3: Checking directory connectivity
> >>>> Pass 4: Checking reference counts
> >>>> Pass 5: Checking group summary information
> >>>> /dev/sdb1: 11/1038336 files (0.0% non-contiguous), 32599/2075195 blocks
> >>>> seed5:/tmp/regtest # mkfs -t ext2 -q /dev/sdb1
> >>>> seed5:/tmp/regtest # fsck -y -ft ext2 /dev/sdb1
> >>>>
> >>>>
> >>> Sounds like data corruption. Do you see the same problem with IPoIB?
> >>>
> >> I'm working with Erez and I tried this with tcp session and there
> >> weren't any problems.
> >
> > Thanks for confirming. So it's the iSER problem.
> >
> > I might break Pete's iSER code so can you revoke the latest three
> > patches and try the same tests?
> >
> >
> > rouen:~/git/tgt$ git-reset --hard HEAD~3
> > HEAD is now at 224ca81... iscsi: add iser support
> >
> > rouen:~/git/tgt$ git-log |head -5
> > commit 224ca81bca8dead8dd355d62422e11fe23f7bdc4
> > Author: Pete Wyckoff <pw at osc.edu>
> > Date: Mon Dec 10 10:06:27 2007 -0500
>
> Yes, I still see the same bad behavior with iSER. Pete & Robin - can you
> try to run the same test (see above) with iSER and see if you get the
> same behavior?

I tried your exact script above, first with 2100 MB and then with
8500 MB as you did, and could not get any corruption in 1000
iterations. Maybe my disk is too slow: it is an internal ATA disk,
accessed via a file in ext3.
Likely some sort of iser issue, though there is an off chance of a
race in bs_sync or that neighborhood that only appears at high speeds.

You were able to get lm_dd to break iser in the past. That was
something I could repeat and fix. Any more failures there? Or if
you can help figure out the nature of the corruption (missing blocks,
rearrangements, etc.), that would definitely help.

		-- Pete
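
One possible way to tell missing blocks from rearrangements is to
write a self-describing pattern to the device and classify whatever
comes back. The sketch below is only an illustration under stated
assumptions, not part of tgt or lm_dd: the names make_block,
make_pattern and classify are hypothetical, the 4096-byte block size
is a guess to be adjusted, and the block size must be a multiple of 8.
Each block is stamped with its own index plus one, so a block of zeros
can only mean data that never arrived.

```python
import struct

BLOCK_SIZE = 4096  # assumed block size; must be a multiple of 8


def make_block(index, block_size=BLOCK_SIZE):
    """Return a block filled with (index + 1) as repeated 8-byte words.

    The +1 offset keeps every valid block distinct from an all-zero block.
    """
    return struct.pack(">Q", index + 1) * (block_size // 8)


def make_pattern(n_blocks, block_size=BLOCK_SIZE):
    """Return n_blocks stamped blocks, ready to be written to the device."""
    return b"".join(make_block(i, block_size) for i in range(n_blocks))


def classify(data, block_size=BLOCK_SIZE):
    """Classify every bad block in data read back from the device.

    missing   -- block is all zeros (the write apparently never landed)
    misplaced -- block is intact but belongs at another index;
                 recorded as (found_at, belongs_at)
    garbled   -- anything else
    """
    report = {"missing": [], "misplaced": [], "garbled": []}
    zero = bytes(block_size)
    for i in range(len(data) // block_size):
        block = data[i * block_size:(i + 1) * block_size]
        if block == make_block(i, block_size):
            continue  # block is where it should be
        if block == zero:
            report["missing"].append(i)
            continue
        (stamp,) = struct.unpack(">Q", block[:8])
        if stamp > 0 and block == make_block(stamp - 1, block_size):
            report["misplaced"].append((i, stamp - 1))
        else:
            report["garbled"].append(i)
    return report
```

To use it, write the output of make_pattern() to the iSER-attached
device, read the same range back, and pass the bytes to classify().
Mostly "missing" entries would point at lost writes, while "misplaced"
entries would point at data landing at the wrong offset.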