[Stgt-devel] Tuning iSER for performance

Pete Wyckoff pw
Mon Mar 10 14:46:20 CET 2008


erezz at Voltaire.COM wrote on Mon, 10 Mar 2008 15:20 +0200:
> Pete Wyckoff wrote:
> > Agreed, that's rather slow, 480 MB/s.  Something else is going on.
> > Closest number I can lay my hands on says 350 kB was 94 us in the
> > pread, 3800 MB/s.
> 
> What's your setup? I'm using a RAM disk that I found here:
> 
> http://marc.info/?l=linux-scsi&m=120331663227540&w=2

Well that would be rather unusual.

Most of the world just does:

    mkdir /tmp/ramdisk
    mount -t tmpfs none /tmp/ramdisk
    dd if=/dev/zero bs=1M count=1024 of=/tmp/ramdisk/lun1
    tgtadm ... --backing-store /tmp/ramdisk/lun1

or similar.
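
To see what such a backing file delivers on its own, with no iscsi or
iser in the path, a small script can time pread directly. This is only
a rough sketch: the function name time_pread, the 128 kB request size
(taken from the strace later in this mail) and the repeat count are
illustrative assumptions, not anything tgtd does. It also measures
Python's buffer allocation along with the copy, so treat the result as
a lower bound.

```python
import os
import time

def time_pread(path, size=128 * 1024, offset=0, repeats=100):
    """Return the best observed pread throughput in MB/s (1 MB = 10**6 bytes).

    Hypothetical helper for illustration; requires a Unix-like OS (os.pread).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            data = os.pread(fd, size, offset)
            elapsed = time.perf_counter() - start
            if len(data) != size:
                raise IOError("short read: got %d of %d bytes" % (len(data), size))
            best = min(best, elapsed)
    finally:
        os.close(fd)
    # Guard against a zero reading from a coarse clock.
    return size / max(best, 1e-9) / 1e6
```

For the setup above that would be called as
time_pread("/tmp/ramdisk/lun1"), and again with the path of whatever
other backing store is being compared.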

> >   You should be measuring memory copy speed here.
> >   
> 
> Do you mean that memory copy is 480 MB/sec? That's slow.
> 
> >   
> >> Another question is - how does pread64 access the SCSI device? I
> >> understand that it reads from /dev/sdX. Does it call sd? How? Is there
> >> any memory copy involved? I'm asking that because I'm used to kernel
> >> space where we just call scsi_do_req.
> >>     
> >
> > It reads from wherever it put your device with ./tgtadm ...
> > --backing-store ... .  Presumably a file on the file system, or a
> > raw block device like /dev/sdb.
> 
> Of course. The question is - what is the interface between pread and
> scsi-ml? That's what I still don't understand.

You can start with the system call and follow it down: sys_pread64,
vfs_read, ..., ext3_readpage, ..., submit_bio, ... .  But I'm
talking about tmpfs, which is slightly different.  Your experimental
scsi ram driver exports a block device, but reads on it still go
through a similar path: blkdev_readpage, submit_bh and on down.

You shouldn't really have to care.  This is internal plumbing that
had better work.

> strace looks like this:
> 
> epoll_wait(3, {}, 1024, 2000) = 0 <2.000066>
> epoll_wait(3, {}, 1024, 2000) = 0 <2.000068>
> epoll_wait(3, {}, 1024, 2000) = 0 <2.000065>
> epoll_wait(3, {}, 1024, 2000) = 0 <2.000065>
> epoll_wait(3, {{EPOLLIN, {u32=5394880, u64=5394880}}}, 1024, 2000) = 1 <1.184798>
> read(10, "\320\235R\0\0\0\0\0", 8) = 8 <0.000009>
> pread(11, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 131072 <0.000336>
> epoll_wait(3, {{EPOLLIN, {u32=5394880, u64=5394880}}}, 1024, 2000) = 1 <0.000088>
> read(10, "\320\235R\0\0\0\0\0", 8) = 8 <0.000024>
> epoll_wait(3, {{EPOLLIN, {u32=5394880, u64=5394880}}}, 1024, 2000) = 1 <0.000023>
> read(10, "\320\235R\0\0\0\0\0", 8) = 8 <0.000024>
> epoll_wait(3, {}, 1024, 2000) = 0 <1.998508>
> epoll_wait(3, {}, 1024, 2000) = 0 <1.999959>

All the time is in pread.  If a normal tmpfs fixes things and you
care about this scsi ram driver, file a bug report against it.  It
is probably better to test it without iscsi and iser first, to see
if it is just inherently slow.
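
As a sanity check on the numbers in this thread, the throughput implied
by a single call is just bytes divided by elapsed time. The sketch
below applies that to the pread in the strace and to the "350 kB in
94 us" figure. It assumes 1 MB = 10**6 bytes and reads "350 kB" as
350 * 1024 bytes; the helper name throughput_mb_s is made up for
illustration.

```python
def throughput_mb_s(nbytes, seconds):
    """Throughput in MB/s, taking 1 MB = 10**6 bytes."""
    return nbytes / seconds / 1e6

# The pread in the strace: 131072 bytes in 0.000336 s.
strace_pread = throughput_mb_s(131072, 0.000336)

# "350 kB was 94 us in the pread", assuming kB means 1024 bytes here.
reference = throughput_mb_s(350 * 1024, 94e-6)

print("strace pread: %.0f MB/s" % strace_pread)   # 390 MB/s
print("reference:    %.0f MB/s" % reference)      # 3813 MB/s
```

The traced pread works out to roughly a tenth of the reference figure.
Running under strace adds some overhead of its own, which may account
for part of the difference from the 480 MB/s measured without it.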

		-- Pete


