[Stgt-devel] [PATCH 03/20] iser transport buf

Sun Oct 28 06:46:19 CET 2007

On Sat, 27 Oct 2007 14:56:23 -0400
Pete Wyckoff <pw at osc.edu> wrote:

> tomof at acm.org wrote on Sat, 27 Oct 2007 23:56 +0900:
> > On Tue, 16 Oct 2007 11:18:57 -0400
> > Pete Wyckoff <pw at osc.edu> wrote:
> > 
> > > For RDMA, it is often nice to use data from a pool of pre-registered
> > > buffers.  To do this, the transport allocates memory for a response and
> > > passes it down to the devices to fill.  Some operations, though,
> > > allocate their own buffers and return that new memory instead.  These
> > > are usually small and the allocation is just done for convenience to
> > > avoid length bounds checking.  Copy the data into the provided transport
> > > buffer instead.
> > 
> > Do you really need pre-registered buffers for INQUIRY and other
> > non-I/O commands?
> 
> To send the data, it must be in a registered buffer.  We preregister
> some for this purpose.  The other way is to dynamically register
> the buffer, then deregister it after the data has been transferred.
> This adds lots of overhead, especially in the small IO case (like
> inquiry), and code complexity.

Sorry, I should have explained better. I meant dynamic
registration. BTW, I'm familiar with RDMA transfers since I have
worked on RDMA technology and lightweight protocols like VIA. And you
know, ibmvstgt uses RDMA.

What I asked is whether we could use pre-registered buffers for I/Os
(read/write) and dynamically registered buffers for non-I/O commands
like INQUIRY. As I said (and you said), it might lead to complexity,
but we have the sense buffer in scsi_cmnd, so don't you need dynamic
registration anyway?
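
To make the pre-registered side of this concrete, here is a minimal
sketch of a buffer pool. It is only an illustration: struct buf_pool
and the pool_*() functions are hypothetical names, not code from tgt,
and the actual registration with the RDMA device is left as a comment
where a real transport would do it once at startup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool of equally sized buffers carved from one region that
 * would be registered with the RDMA device once, at startup. */
struct buf_pool {
	char *region;      /* the single pre-registered region */
	size_t nbufs;      /* number of buffers in the region */
	void **free_list;  /* stack of currently free buffers */
	size_t nfree;      /* number of entries on the stack */
};

/* Returns 0 on success, -1 on allocation failure. */
int pool_init(struct buf_pool *p, size_t buf_size, size_t nbufs)
{
	size_t i;

	assert(p && buf_size > 0 && nbufs > 0);
	p->region = malloc(buf_size * nbufs);
	p->free_list = malloc(nbufs * sizeof(*p->free_list));
	if (!p->region || !p->free_list) {
		free(p->region);
		free(p->free_list);
		return -1;
	}
	/* A real transport would register p->region with the device here. */
	for (i = 0; i < nbufs; i++)
		p->free_list[i] = p->region + i * buf_size;
	p->nbufs = nbufs;
	p->nfree = nbufs;
	return 0;
}

/* Pop a buffer; returns NULL when the pool is exhausted. */
void *pool_get(struct buf_pool *p)
{
	return p->nfree ? p->free_list[--p->nfree] : NULL;
}

/* Push a buffer back; it stays registered for the next user. */
void pool_put(struct buf_pool *p, void *buf)
{
	assert(p->nfree < p->nbufs);
	p->free_list[p->nfree++] = buf;
}

void pool_destroy(struct buf_pool *p)
{
	/* A real transport would deregister the region here. */
	free(p->region);
	free(p->free_list);
}
```

With something like this, getting and returning a buffer costs a
pointer pop or push, while dynamic registration would pay a
register/deregister pair for every command.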

> Currently iscsi_alloc_task() uses malloc() to get a buffer big
> enough for the incoming command and any data that will be returned
> such as the inquiry result.  The approach in RDMA is to make sure
> that the allocation comes from the preregistered area.  TCP just
> uses malloc() as before.

As I wrote in another mail, TCP might also need a pre-registered
memory mechanism for DIO.

> IO operations like in bs_sync will copy data into the cmd->uaddr
> (same as iscsi's task->data).  Some inquiry-like routines allocate a
> page-size buffer themselves with valloc() to avoid lots of little
> bounds checks, and return this instead of using the provided
> cmd->uaddr.  But this avoids our preregistered buffer space.
> 
> So the new approach in these inquiry-like routines is to continue to
> valloc() a page to hold the generated data and fill it, but then
> invoke a new helper "spc_return_buf()" that copies data from the
> valloc()-ed area into cmd->uaddr.
> 
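
For readers following along, the copy step described above could look
roughly like the sketch below. The real spc_return_buf() signature is
not shown in this thread, so struct sketch_cmd, its len field, and
sketch_return_buf() are assumptions made for illustration only; the
one thing taken from the description is that data generated in a
scratch page is copied into cmd->uaddr.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the parts of the command this sketch needs. */
struct sketch_cmd {
	void *uaddr;   /* transport-provided buffer, may be NULL */
	size_t len;    /* capacity of uaddr in bytes (assumed field) */
};

/*
 * Copy generated data into the transport-provided buffer, truncating to
 * the buffer's capacity.  Returns the number of bytes copied, or 0 if the
 * transport did not provide a buffer.
 */
size_t sketch_return_buf(struct sketch_cmd *cmd, const void *data,
			 size_t data_len)
{
	size_t n;

	if (!cmd->uaddr)
		return 0;
	n = data_len < cmd->len ? data_len : cmd->len;
	memcpy(cmd->uaddr, data, n);
	return n;
}
```

The caller would still own the scratch page and free it after the
copy, so the only length check lives in this one place.
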
> > Using pre-registered buffers might make the code simpler than handling
> > both pre-registered and normal buffers, but we already need to handle
> > something like that for mmapped I/Os.
> 
> Only bs_mmap handles cmd->mmaped.  It does not appear to be for use
> with iscsi.
> 
> The modifications to inquiry only happen if cmd->uaddr is non-NULL.
> 
> > > Also fixes some leaks of these extra buffers in error paths, and cleans
> > > up unreachable code in ibmvio inquiry.
> > 
> > Can you send a separate patch to do that?
> 
> Split up into two prior patches.

Thanks, applied.