[stgt] stgtd 0.9.3 : Read-Errors using iser transport
Dr. Volker Jaenisch
volker.jaenisch at inqbus.de
Fri Feb 20 12:03:02 CET 2009
Hi Or!
I have tracked it down further. The cofactor seems to be SMP.
When booting the server (8 cores) with maxcpus=1, the read errors vanish
completely.
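For reference: maxcpus= and the mem=256M used in the tests below are plain
kernel boot parameters, simply appended to the kernel line in the boot
loader; the kernel image and root device in this example line are only
placeholders:
kernel /vmlinuz root=/dev/sda1 ro maxcpus=1 mem=256M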
This is consistent with the findings of others:
http://lists.berlios.de/pipermail/stgt-devel/2008-February/001367.html
In that case multiple simultaneous reads were the problem. Please note
also that in February 2008 the git commit "iscsi: improve iser scheduling"
from Sept. 2008 did not yet exist. So the iser scheduling improvement may
have made things worse, but the original error is older.
Or Gerlitz wrote:
>> I found that subsequent reads on a small timescale often succeed. Also,
>> after a pause of a couple of seconds, reads usually succeed. The timescale
>> for failure lies in between. Please try a more random time distribution.
>> How big is your sample? I used 1GB. When testing by hand, 1 out of 6 reads
>> fails. Same behavior on stgt 0.8 and 0.9.0.
>>
>
> I use 1GB as well. It would be helpful if you could provide me with a script
> that does these random timings between reads. Also, I noted that after
> one read, no more I/O takes place on the target side, as this 1GB
> probably gets cached. My backing store is an sdb block device; I
> wasn't sure what yours is and whether you have caching at all. Maybe this
> influences something.
>
As the test target I use a 100GB LVM2 logical volume:
athene:~/tgt-0.9.0/usr# lvdisplay
  --- Logical volume ---
  LV Name                /dev/vg0/test
  VG Name                vg0
  LV UUID                85CXvj-YLoE-DrRt-mAVu-Ujaz-cLW4-RwQveP
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                100,00 GB
  Current LE             25600
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:3
This sequence is used to start the target:
athene:~/tgt-0.9.0/usr# ./tgtd
athene:~/tgt-0.9.0/usr# tgtadm --lld iscsi --op new --mode target --tid
1 -T de.inqbus.athene:test
athene:~/tgt-0.9.0/usr# tgtadm --lld iscsi --op new --mode logicalunit
--tid 1 --lun 1 -b /dev/vg0/test
athene:~/tgt-0.9.0/usr# tgtadm --lld iscsi --op bind --mode target --tid
1 -I 10.1.3.0/24
athene:~/tgt-0.9.0/usr# ./tgtadm --m target --op show
Target 1: de.inqbus.athene:test
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 4
            Initiator: iqn.1993-08.org.debian:01:cb2c5d33d1f8
            Connection: 0
                RDMA IP Address: 10.1.3.33
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: deadbeaf1:0
            SCSI SN: beaf10
            Size: 0 MB
            Online: Yes
            Removable media: No
            Backing store: No backing store
        LUN: 1
            Type: disk
            SCSI ID: deadbeaf1:1
            SCSI SN: beaf11
            Size: 107374 MB
            Online: Yes
            Removable media: No
            Backing store: /dev/vg0/test
    Account information:
    ACL information:
        10.1.3.0/24
On the initiator I do:
ares:~# iscsiadm -m session
iscsiadm: No active sessions.
ares:~# iscsi_discovery 10.1.3.32 -tiser -l
iscsiadm: No active sessions.
Set target de.inqbus.athene:test to automatic login over iser to portal
10.1.3.32:3260
discovered 1 targets at 10.1.3.32
The test script:
import random
import os
import time

# writing: fill the device with a known pattern
writeCmd = 'lmdd if=internal of=/dev/sdc opat=1 bs=1M count=1000'
# reading: read it back, verify the pattern, report up to 10 mismatches
readCmd = 'lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10'

def tee(cmd):
    print cmd
    os.system(cmd)

tee(writeCmd)

while True:
    # sleepTime = random.randrange(0, 10)
    sleepTime = 1
    print 'sleeping %s seconds ..' % sleepTime
    time.sleep(sleepTime)
    tee(readCmd)
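To get the random timings Or asked for, the fixed one-second sleep can be
replaced with the commented-out randrange call, e.g.:

    # random pause of 0-9 seconds between reads
    sleepTime = random.randrange(0, 10)

For the runs below I kept the fixed sleepTime = 1, which is already enough
to trigger the errors on the SMP kernel.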
Output with maxcpus=1 mem=256M (to prevent caching):
ares:~# python rndTest.py
lmdd if=internal of=/dev/sdc opat=1 bs=1M count=1000
1000.0000 MB in 2.1589 secs, 463.2016 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7827 secs, 359.3686 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.8148 secs, 355.2689 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7981 secs, 357.3900 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7960 secs, 357.6489 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7885 secs, 358.6165 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.8016 secs, 356.9349 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.8066 secs, 356.3076 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.8142 secs, 355.3358 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.8174 secs, 354.9415 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.8349 secs, 352.7521 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.8078 secs, 356.1559 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.8028 secs, 356.7876 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7893 secs, 358.5088 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7898 secs, 358.4447 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7887 secs, 358.5943 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7846 secs, 359.1125 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7951 secs, 357.7686 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7817 secs, 359.4879 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.8016 secs, 356.9339 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7750 secs, 360.3620 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.7953 secs, 357.7491 MB/sec
sleeping 1 seconds ..
This seems to be ok.
Now with 2 cores (maxcpus=2 mem=256M):
ares:~# python rndTest.py
lmdd if=internal of=/dev/sdc opat=1 bs=1M count=1000
1000.0000 MB in 2.3562 secs, 424.4106 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
1000.0000 MB in 2.2888 secs, 436.9008 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
off=119000000 want=71ff000 got=721f000
off=119000000 want=71ff004 got=721f004
off=119000000 want=71ff008 got=721f008
off=119000000 want=71ff00c got=721f00c
off=119000000 want=71ff010 got=721f010
off=119000000 want=71ff014 got=721f014
off=119000000 want=71ff018 got=721f018
off=119000000 want=71ff01c got=721f01c
off=119000000 want=71ff020 got=721f020
off=119000000 want=71ff024 got=721f024
119.0000 MB in 0.2676 secs, 444.6105 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
off=309000000 want=1273d600 got=1275d600
off=309000000 want=1273d604 got=1275d604
off=309000000 want=1273d608 got=1275d608
off=309000000 want=1273d60c got=1275d60c
off=309000000 want=1273d610 got=1275d610
off=309000000 want=1273d614 got=1275d614
off=309000000 want=1273d618 got=1275d618
off=309000000 want=1273d61c got=1275d61c
off=309000000 want=1273d620 got=1275d620
off=309000000 want=1273d624 got=1275d624
309.0000 MB in 0.7249 secs, 426.2704 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
off=126000000 want=7900000 got=7901000
off=126000000 want=7900004 got=7901004
off=126000000 want=7900008 got=7901008
off=126000000 want=790000c got=790100c
off=126000000 want=7900010 got=7901010
off=126000000 want=7900014 got=7901014
off=126000000 want=7900018 got=7901018
off=126000000 want=790001c got=790101c
off=126000000 want=7900020 got=7901020
off=126000000 want=7900024 got=7901024
126.0000 MB in 0.2968 secs, 424.4868 MB/sec
sleeping 1 seconds ..
lmdd of=internal if=/dev/sdc ipat=1 bs=1M count=1000 mismatch=10
off=80000000 want=4cdec00 got=4cfec00
off=80000000 want=4cdec04 got=4cfec04
off=80000000 want=4cdec08 got=4cfec08
off=80000000 want=4cdec0c got=4cfec0c
off=80000000 want=4cdec10 got=4cfec10
off=80000000 want=4cdec14 got=4cfec14
off=80000000 want=4cdec18 got=4cfec18
off=80000000 want=4cdec1c got=4cfec1c
off=80000000 want=4cdec20 got=4cfec20
off=80000000 want=4cdec24 got=4cfec24
80.0000 MB in 0.1790 secs, 447.0273 MB/sec
This behavior is completely reproducible. My guess is that AMD
HyperTransport may interfere with the FMRs. But I am no Linux memory
management specialist, so please correct me if I am wrong.
Maybe the following happens:
Booted with one CPU, all FMR requests go to the 16GB of RAM that this
single CPU addresses directly via its own memory controller.
With more than one active CPU, memory is fetched from both CPUs'
memory controllers, with a preference for local memory. In rare cases the
memory manager fetches memory for the FMR process running on CPU0 from
CPU1 via the HyperTransport link, and something weird happens.
This is sheer guesswork; I have no hard facts for it.
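One way to probe this NUMA guess (just a sketch, not verified; it assumes
numactl is installed on the target) would be to boot the full 8-core kernel
but pin tgtd and its memory allocations to a single node before repeating
the test:

  numactl --cpunodebind=0 --membind=0 ./tgtd

or, without numactl, at least pin it to one CPU:

  taskset -c 0 ./tgtd

If the mismatches disappear with the pinned tgtd and reappear without the
pinning, that would point at cross-node memory access rather than at SMP
scheduling in general.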
Cheers
Volker
--
====================================================
inqbus it-consulting +49 ( 341 ) 5643800
Dr. Volker Jaenisch http://www.inqbus.de
Herloßsohnstr. 12 0 4 1 5 5 Leipzig
N O T - F Ä L L E +49 ( 170 ) 3113748
====================================================