[Stgt-devel] project status

Thu Aug 3 19:02:28 CEST 2006

From: FUJITA Tomonori <fujita.tomonori at lab.ntt.co.jp>
Subject: Re: [Stgt-devel] project status
Date: Fri, 04 Aug 2006 00:20:08 +0900

> From: Tom Tucker <tom at opengridcomputing.com>
> Subject: Re: [Stgt-devel] project status
> Date: Thu, 03 Aug 2006 09:35:45 -0500
> 
> > [...snip...]
> > > It turned out that we were using netlink wrongly, so we replaced it.
> > > We need a bi-directional, high-speed interface between user and
> > > kernel space. Currently, there is no such interface in mainline, so
> > > we use a mapped ring buffer between kernel and user space, backed by
> > > our own character device (I need to replace the very old way of
> > > creating a character device). If we find something better, we will
> > > replace the current code (the netdev guys seem to be bringing
> > > something promising for network AIO).
> > 
> > I just looked at this code and am somewhat confused. I understand that
> > the write handler in the kernel has the user mode process context and
> > therefore understands any uspace pointer contained in the event. 
> > 
> > What puzzles me is that the write handler in the kernel doesn't use the
> > passed in buffer, but instead takes the event from an mmaped ring
> > buffer. It then passes the various fields in the event down to the
> > handlers as parameters:
> > 
> > 	...
> > 		err = scsi_tgt_kspace_exec(ev->u.cmd_rsp.host_no,
> > 					   ev->u.cmd_rsp.cid,
> > 					   ev->u.cmd_rsp.result,
> > 					   ev->u.cmd_rsp.len,
> > 					   ev->u.cmd_rsp.uaddr,
> > 					   ev->u.cmd_rsp.rw);
> > 	...
> > At this point, you don't need 'ev' anymore (the contents have been
> > copied into registers as parameters), so I don't understand the need for
> > the ring buffer. The user space caller could just as well have passed
> > you a pointer to a local variable in the write system call. Wouldn't
> > this avoid the complexity of the ring buffer altogether?
> 
> I guess that there are some reasons.
> 
> 1. We are still not sure what interface we will use in the future. We
> might use a different interface. So I preferred fewer changes.

That may not have been clear.

What I meant is that we prefer a simple, generic approach, so that we
can replace it with a generic interface if one appears later. That is
why we use a ring buffer in both directions. We are happy to move to
a new interface at any time, as long as it is accepted into mainline.

There is one issue with the ring buffer. We currently use a single
ring, and if the ring is full (that is, the user-space daemon is too
slow), we return an error to the LLDs and expect them to throw away
the SCSI commands. In that case the initiator sends a TMF, and if
everything goes well we get back to normal. However, queueing some
commands or resizing the rings (at least up to the number of commands
that the transport layers accept) would be nice. Using one ring per
target entity might also be an interesting approach worth
considering.
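
To illustrate the ring-full behaviour described above, here is a
minimal user-space sketch in C. It is not the tgt code: the struct
layout, the names ring_sketch, ring_push and ring_pop, the four-slot
size, and the choice of -EBUSY and -EAGAIN as error codes are all
assumptions made for illustration. A real ring shared between kernel
and user space would also need memory barriers and a wake-up
mechanism, which are left out here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event, loosely modelled on the cmd_rsp fields quoted earlier. */
struct tgt_event_sketch {
	int      host_no;
	uint32_t cid;
	int      result;
	uint32_t len;
	uint64_t uaddr;
	uint8_t  rw;
};

/* Assumed size: tiny on purpose, so the full condition is easy to hit. */
#define RING_SLOTS 4u
_Static_assert((RING_SLOTS & (RING_SLOTS - 1u)) == 0u,
	       "RING_SLOTS must be a power of two");

struct ring_sketch {
	struct tgt_event_sketch slot[RING_SLOTS];
	unsigned int head;	/* free-running count of events produced */
	unsigned int tail;	/* free-running count of events consumed */
};

void ring_init(struct ring_sketch *r)
{
	assert(r != NULL);
	r->head = 0;
	r->tail = 0;
}

/*
 * Producer side. Returns 0 on success, or -EBUSY when the consumer has
 * fallen behind and every slot is occupied. In the scenario described
 * above, the caller would pass that error up so the command is dropped.
 */
int ring_push(struct ring_sketch *r, const struct tgt_event_sketch *ev)
{
	assert(r != NULL && ev != NULL);
	if (r->head - r->tail == RING_SLOTS)
		return -EBUSY;
	r->slot[r->head & (RING_SLOTS - 1u)] = *ev;
	r->head++;
	return 0;
}

/* Consumer side. Returns 0 on success, or -EAGAIN when the ring is empty. */
int ring_pop(struct ring_sketch *r, struct tgt_event_sketch *out)
{
	assert(r != NULL && out != NULL);
	if (r->head == r->tail)
		return -EAGAIN;
	*out = r->slot[r->tail & (RING_SLOTS - 1u)];
	r->tail++;
	return 0;
}
```

With this layout, the fifth ring_push() on an untouched four-slot ring
fails, and a single ring_pop() frees one slot again. Queueing or
resizing, as suggested above, would replace the early -EBUSY return
with a slower but non-lossy path.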

> 2. As you said, we need a process context due to bio_map_user. We use
> a single user-space process, so this might be a bottleneck. Maybe we
> will need to find a way to use a workqueue (kernel threads) for I/O
> after performance experiments.