[Stgt-devel] project status

Thu Aug 3 17:20:08 CEST 2006

From: Tom Tucker <tom at opengridcomputing.com>
Subject: Re: [Stgt-devel] project status
Date: Thu, 03 Aug 2006 09:35:45 -0500

> [...snip...]
> > It turned out that we were using netlink incorrectly, so we replaced
> > it. We need a bi-directional, high-speed interface between user and
> > kernel space. Currently, there is no such interface in mainline, so we
> > use a ring buffer mapped between kernel and user space through our own
> > character device (I need to replace the very old way of creating a
> > character device). If we find something better, we will replace the
> > current code (the netdev guys seem to be bringing something promising
> > for network AIO).
> 
> I just looked at this code and am somewhat confused. I understand that
> the write handler in the kernel has the user mode process context and
> therefore understands any uspace pointer contained in the event. 
> 
> What puzzles me is that the write handler in the kernel doesn't use the
> passed in buffer, but instead takes the event from an mmaped ring
> buffer. It then passes the various fields in the event down to the
> handlers as parameters:
> 
> 	...
> 		err = scsi_tgt_kspace_exec(ev->u.cmd_rsp.host_no,
> 					   ev->u.cmd_rsp.cid,
> 					   ev->u.cmd_rsp.result,
> 					   ev->u.cmd_rsp.len,
> 					   ev->u.cmd_rsp.uaddr,
> 					   ev->u.cmd_rsp.rw);
> 	...
> At this point, you don't need 'ev' anymore (the contents have been
> copied into registers as parameters), so I don't understand the need for
> the ring buffer. The user space caller could just as well have passed
> you a pointer to a local variable in the write system call. Wouldn't
> this avoid the complexity of the ring buffer altogether?

I guess there are a couple of reasons.

1. We are still not sure which interface we will use in the future and
might switch to a different one, so I preferred to keep changes to a
minimum.

2. As you said, we need a process context because of bio_map_user. We
use a single user-space process, so this might become a bottleneck.
After performance experiments, we may need to find a way to use a
workqueue (kernel threads) for I/O.
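
To make the ring-buffer handoff discussed above concrete, here is a
minimal user-space sketch, not the actual tgt code. It assumes a
fixed-size ring of event slots with a per-slot status flag: the
producer fills a slot and marks it pending, and the consumer (standing
in for the kernel write handler) drains every pending slot on a single
"kick". The names ring_event, ring, ring_post, ring_drain, sum_len and
RING_SLOTS are hypothetical; only the event fields mirror the cmd_rsp
members quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SLOTS 8	/* hypothetical size, chosen only for illustration */

/* Hypothetical event layout, loosely modeled on the cmd_rsp fields. */
struct ring_event {
	uint32_t status;	/* 0 = slot free, 1 = event pending */
	int host_no;
	uint32_t cid;
	int result;
	uint32_t len;
	uint64_t uaddr;
	uint8_t rw;
};

struct ring {
	struct ring_event slot[RING_SLOTS];
	unsigned int head;	/* next slot the producer fills */
	unsigned int tail;	/* next slot the consumer drains */
};

/* Producer: copy an event into the next slot. Returns 0, or -1 if full. */
static int ring_post(struct ring *r, const struct ring_event *ev)
{
	struct ring_event *dst;

	assert(r->head < RING_SLOTS);
	dst = &r->slot[r->head];
	if (dst->status)
		return -1;	/* consumer has not drained this slot yet */

	*dst = *ev;
	/* Publish last. Real shared memory would need a barrier here. */
	dst->status = 1;
	r->head = (r->head + 1) % RING_SLOTS;
	return 0;
}

/*
 * Consumer: drain every pending event, as one write() "kick" might.
 * Returns the number of events handed to the handler.
 */
static int ring_drain(struct ring *r,
		      void (*handler)(const struct ring_event *, void *),
		      void *arg)
{
	int n = 0;

	for (;;) {
		struct ring_event *ev = &r->slot[r->tail];

		if (!ev->status)
			break;
		handler(ev, arg);
		ev->status = 0;	/* hand the slot back to the producer */
		r->tail = (r->tail + 1) % RING_SLOTS;
		n++;
	}
	return n;
}

/* Example handler: accumulate the len field of each event. */
static void sum_len(const struct ring_event *ev, void *arg)
{
	*(uint64_t *)arg += ev->len;
}
```

In this sketch one drain call can consume many queued events, which is
the kind of batching a ring allows and a single struct passed through
write() does not. Whether that benefit matters in practice is exactly
what the performance experiments would need to show.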