[Stgt-devel] Tuning iSER for performance

Erez Zilber erezz
Tue Mar 11 14:05:05 CET 2008


Pete Wyckoff wrote:
> erezz at Voltaire.COM wrote on Mon, 10 Mar 2008 15:20 +0200:
>   
>> Pete Wyckoff wrote:
>>     
>>> Agreed, that's rather slow, 480 MB/s.  Something else is going on.
>>> Closest number I can lay my hands on says 350 kB was 94 us in the
>>> pread, 3800 MB/s.
>>>       
>> What's your setup? I'm using a RAM disk that I found here:
>>
>> http://marc.info/?l=linux-scsi&m=120331663227540&w=2
>>     
>
> Well that would be rather unusual.
>
> Most of the world just does:
>
>     mkdir /tmp/ramdisk
>     mount -t tmpfs none /tmp/ramdisk
>     dd if=/dev/zero bs=1M count=1024 of=/tmp/ramdisk/lun1
>     tgtadm ... --backing-store /tmp/ramdisk/lun1
>
> or similar.
>   

You always learn something new :-) . We didn't use RAM disks until now.
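
For anyone who wants to check the backing file by itself, without iSCSI or iSER in the path, here is a minimal sketch that times repeated pread calls, the same syscall that shows up in the strace further down. The function name measure_pread, the temporary file standing in for /tmp/ramdisk/lun1, and the block size and count are illustrative assumptions, not anything tgtd does internally.

```python
import os
import tempfile
import time


def measure_pread(path, block_size=128 * 1024, count=64):
    """Time `count` pread() calls of `block_size` bytes each from offset 0.

    Returns (total_bytes_read, elapsed_seconds).
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(count):
            data = os.pread(fd, block_size, 0)
            total += len(data)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed


if __name__ == "__main__":
    # Hypothetical stand-in for the backing store: a 1 MiB zero-filled file.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * (1024 * 1024))
        path = f.name
    try:
        total, elapsed = measure_pread(path)
        print(f"{total} bytes in {elapsed:.6f} s = {total / elapsed / 1e6:.0f} MB/s")
    finally:
        os.unlink(path)
```

Pointing the path at the real backing file (on tmpfs or on the experimental RAM driver) should show whether the storage itself is the slow part.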

>>>   You should be measuring memory copy speed here.
>>>
>> Do you mean that memory copy is 480 MB/sec? That's slow.
>>
>>>> Another question is - how does pread64 access the SCSI device? I
>>>> understand that it reads from /dev/sdX. Does it call sd? How? Is there
>>>> any memory copy involved? I'm asking that because I'm used to kernel
>>>> space where we just call scsi_do_req.
>>>>     
>>>>         
>>> It reads from wherever it put your device with ./tgtadm ...
>>> --backing-store ... .  Presumably a file on the file system, or a
>>> raw block device like /dev/sdb.
>>>       
>> Of course. The question is - what is the interface between pread and
>> scsi-ml? That's what I still don't understand.
>>     
>
> You can start with the system call and follow it down: sys_pread64,
> vfs_read, ..., ext3_readpage, ..., submit_bio, ... .  But I'm
> talking tmpfs, which is slightly different.  And your experimental
> scsi ram driver would export a block device but still goes
> similarly through blkdev_readpage, submit_bh and on down.
>
> You shouldn't really have to care.  This is internal plumbing that
> better work.
>
>   

OK. So eventually (with real storage) this goes down to the sd driver
and scsi_mod. Is there a copy_from_user, or any other data copy, along
the way?

>> strace looks like that:
>>
>> epoll_wait(3, {}, 1024, 2000) = 0 <2.000066>
>> epoll_wait(3, {}, 1024, 2000) = 0 <2.000068>
>> epoll_wait(3, {}, 1024, 2000) = 0 <2.000065>
>> epoll_wait(3, {}, 1024, 2000) = 0 <2.000065>
>> epoll_wait(3, {{EPOLLIN, {u32=5394880, u64=5394880}}}, 1024, 2000) = 1 <1.184798>
>> read(10, "\320\235R\0\0\0\0\0", 8) = 8 <0.000009>
>> pread(11, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 131072 <0.000336>
>> epoll_wait(3, {{EPOLLIN, {u32=5394880, u64=5394880}}}, 1024, 2000) = 1 <0.000088>
>> read(10, "\320\235R\0\0\0\0\0", 8) = 8 <0.000024>
>> epoll_wait(3, {{EPOLLIN, {u32=5394880, u64=5394880}}}, 1024, 2000) = 1 <0.000023>
>> read(10, "\320\235R\0\0\0\0\0", 8) = 8 <0.000024>
>> epoll_wait(3, {}, 1024, 2000) = 0 <1.998508>
>> epoll_wait(3, {}, 1024, 2000) = 0 <1.999959>
>>     
>
> All the time is in pread.  If a normal tmpfs fixes things, file a
> bug report if you care about this scsi ram driver.  Probably better
> if you test it without iscsi and iser to see if it is just
> inherently slow.
>   
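
As a rough check on the numbers quoted above, the pread in the strace moved 131072 bytes in 0.000336 s, while Pete's figure was 350 kB in 94 us. The sketch below converts both to MB/s. It assumes 1 MB = 10^6 bytes and reads "kB" as 1024 bytes; neither convention is stated in the thread, and the helper name is only illustrative.

```python
def throughput_mb_per_s(num_bytes, seconds):
    """Return throughput in MB/s, with 1 MB taken as 10**6 bytes."""
    return num_bytes / seconds / 1e6


# pread of 131072 bytes that strace timed at 0.000336 s (scsi ram driver).
slow = throughput_mb_per_s(131072, 0.000336)

# Pete's figure: 350 kB in 94 us; "kB" is assumed here to mean 1024 bytes.
fast = throughput_mb_per_s(350 * 1024, 94e-6)

print(f"strace pread: {slow:.0f} MB/s")   # 390 MB/s
print(f"Pete's pread: {fast:.0f} MB/s")   # 3813 MB/s
```

So the single pread in the trace runs at roughly 390 MB/s, the same order as the 480 MB/s seen end to end and about a tenth of Pete's number.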

Looks better now (with the RAM disk that you use):

epoll_wait(3, {{EPOLLIN, {u32=5394880, u64=5394880}}}, 1024, 2000) = 1 <0.000100>
read(10, "\320\235R\0\0\0\0\0", 8) = 8 <0.000005>

Now, I also get nicer numbers with sgp_dd (bs=512, bpt=1024, thr=8,
time=1, count=102400000, dio=1):

READ - 1380 MB/sec
WRITE - 1420 MB/sec

with small IOs (1k):

READ - 40K
WRITE - 20K
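
To put the large-transfer sgp_dd numbers above in terms of operations, here is a small sketch of the arithmetic. With bs=512 and bpt=1024, each transfer is 512 KiB. The conversion assumes the reported "MB/sec" means 10^6 bytes per second, which is a guess about the unit, and the helper name is hypothetical.

```python
BS = 512      # sgp_dd bs: bytes per block
BPT = 1024    # sgp_dd bpt: blocks per transfer

bytes_per_transfer = BS * BPT   # 524288 bytes = 512 KiB


def transfers_per_second(mb_per_s, transfer_bytes):
    """Convert MB/s (assumed 10**6 bytes/s) into transfers per second."""
    return mb_per_s * 1e6 / transfer_bytes


read_tps = transfers_per_second(1380, bytes_per_transfer)
write_tps = transfers_per_second(1420, bytes_per_transfer)

print(f"bytes per transfer: {bytes_per_transfer}")
print(f"READ:  about {read_tps:.0f} transfers/s")
print(f"WRITE: about {write_tps:.0f} transfers/s")
```

Under those assumptions the large-transfer runs work out to roughly 2600 to 2700 transfers per second across the 8 threads.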

I will try to get my hands on some really fast storage and retest.

Erez


