[Stgt-devel] [PATCH 04/20] iser bidi alloc read buf

FUJITA Tomonori tomof
Sun Oct 28 06:46:20 CET 2007

On Sat, 27 Oct 2007 17:29:47 -0400
Pete Wyckoff <pw at osc.edu> wrote:

> tomof at acm.org wrote on Sat, 27 Oct 2007 23:56 +0900:
> > On Tue, 16 Oct 2007 11:19:03 -0400
> > Pete Wyckoff <pw at osc.edu> wrote:
> > 
> > > Allocate the read buffer for bidirectional commands in the transport to
> > > pass down to devices.  A device can fill and return this buffer in
> > > task->uaddr, and thus choose to do the read or write processing in any
> > 
> > Where do you free bidi buffer?
> The bs sets cmd->uaddr to be the data-in buffer (from initiator
> point of view) as it returns.  iscsi puts this in task->addr then
> later frees it if (task->addr != task->data).  Yeah, kind of
> complex.

Yeah, I really messed up the data buffer handling and bidi will make
it worse.

> Write command:  single allocation: task plus data-out all in one go.
>     bs is given task->data, does not change cmd->uaddr.  Single free
>     of task releases it.
> Read command:  single alloction:  task plus expected data-in size.
>     bs is given task->data, fills it, does not change cmd->uaddr.
>     Single free.
> (Inquiry etc look like read, and used to return a new cmd->uaddr,
>     requiring a second free.  But now use given task->data after
>     iser-transport-buf is applied.)
> Bidi command:  two allocations: task plus data-out, expected
>     data-in.  bs is given task->data as cmd->uaddr, and the
>     buffer for the data-in as cmd->bidi_uaddr.  It sets cmd->uaddr
>     to cmd->bidi_uaddr as it finishes, then iscsi frees that and
>     the task.
> Before this patch, bidi started out like write, but then the bs
> would malloc a new buffer for the read result and return that in
> cmd->uaddr.  iscsi_free_task() sees task->addr != task->data and
> frees that as well as task, just like old inquiry implementation.
> Change here is to make sure we can use a preallocated buffer from
> the transport if one is given.

Would scsi_data_buffer in scsi-ml to make the code more simple? I have
no problem with adding two scsi_data_buffer structures to scsi_cmnd.

struct scsi_cmnd {
       struct scsi_data_buffer data_in;
       struct scsi_data_buffer data_out;

And we add some accessors. iSCSI code should not access to them

Another thing that I'd like to is moving the buffer management from
iSCSI code as much as possible. We will have more software target
drivers like SRP and FCoE. They need bidi too and we need clean and
simple buffer management.

> > Please don't forget that bidi support isn't only for iSER. everyone
> > needs it.
> Yes, we got bidi working back in March 2007 and use it every day,
> without iSER.  Unfortunately the only bidi user is in our
> out-of-tree OSD code.  We're releasing the software OSD target
> library soon, then could add the backing store code to stgt,
> dependent on the presence of that library.

Great, I'm looking forward to OSD code.

> > The current buffer management is not clean so feel free to rewrite it
> > before iSER if you want.
> I would be willing to do this, but don't have a better suggestion
> yet.  Any ideas welcome.  Difficulty is supporting both bidi-aware
> and non-aware bs, and pre-malloc and non-pre-malloc transports.  All
> while minimizing the number of malloc operations per command.
> > > order.  Unfortunately, this allocation can not be combined with the task
> > > and task->data allocation as the bidi read size is not known until after
> > > AHS processing.
> > 
> > We need to change task->data allocation since we need aligned buffer
> > for DIO. The iSCSI driver uses page cache buf IET experience tells
> > me that we need to support DIO too.
> As in bs_xen?  Do you think the DIO alignment needs could fit into
> the framework described above?  Or does the BS somehow need to tell
> the transport what to allocate, to get alignment space?  That would
> definitely affect any proposed buffer mgmt rewrite.

Please forget bs_xen. We could remove it. :)

iSCSI seems to need DIO. It turned out that several people have
problems with IET default I/O scheme using page cache.

And yes, DIO needs aligment buffer. So as I wrote in another mail,
pre-allocated aligned buffer would work for DIO.

More information about the stgt mailing list