[Stgt-devel] More threads for device server

FUJITA Tomonori fujita.tomonori at lab.ntt.co.jp
Mon Sep 5 08:32:19 CEST 2005

From: Mike Christie <michaelc at cs.wisc.edu>
Subject: Re: [Stgt-devel] More threads for device server
Date: Sun, 04 Sep 2005 22:45:41 -0500

> >>>The current code uses work queue for performing SCSI commands (or
> >>>block target's tasks). Work queue is simple and good enough for
> >>>debugging, however, a single thread per CPU is not good enough (from
> >>>the performance perspective).
> >>>
> >>>I thought about creating multiple kernel threads by hand. Are there
> >>>handy APIs?
> >>
> >>you can create a single threaded workqueue per target or session?
> > 
> > 
> > The vfs APIs work synchronously. So we need multiple threads per
> > target (or session) to perform several SCSI commands simultaneously.
> > 
> > If we always use asynchronous block I/O APIs (like AIO vfs,
> > submit_bio, etc), a single threaded workqueue would be fine.
> I think async is going to be better in the long run since a thread per
> device sounds like a lot. I am not familiar with the AIO vfs code so I am
> not much help and my opinion is really more of a guess. I am just
> thinking if we can send more than one command down to the real device at
> once then we could take advantage of the block layer's I/O scheduling or
> something.

Sorry, I should have stated this issue more precisely.

We need three kinds of delayed work: performing SCSI commands,
notifying the target of the completion of session creation, and
notifying it of the completion of buffer allocation.

Now we use the system default work queue (keventd). This is
insufficient if a user creates lots of targets, so I need a new work
queue (per target, session, or device).

I think a work queue per target is sufficient for the completion
notifications. They are not performance-critical.

Targets need to perform multiple SCSI commands at the same time. If we
use synchronous APIs, the work queue framework is not sufficient
because it cannot perform lots of SCSI commands simultaneously. 
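
To make the serialization concrete, here is a small userspace C model,
not stgt code. It assumes every command blocks its worker for the same
fixed cost and that the next command goes to whichever worker is free
first; the name model_makespan and the MAX_WORKERS limit are made up
for this illustration.

```c
#define MAX_WORKERS 64  /* arbitrary limit for this model */

/*
 * Time at which the last command finishes when n_cmds blocking commands,
 * each occupying its worker for `cost` time units, are handed in order to
 * whichever of n_workers threads becomes free first.
 * Returns -1 on invalid arguments.
 */
static long model_makespan(int n_cmds, int n_workers, long cost)
{
    long free_at[MAX_WORKERS] = {0};  /* when each worker is next idle */
    long last = 0;

    if (n_cmds < 0 || n_workers < 1 || n_workers > MAX_WORKERS || cost < 0)
        return -1;

    for (int i = 0; i < n_cmds; i++) {
        int w = 0;

        /* Pick the worker that becomes idle earliest. */
        for (int j = 1; j < n_workers; j++)
            if (free_at[j] < free_at[w])
                w = j;

        /* A synchronous call keeps that worker busy for the full cost. */
        free_at[w] += cost;
        if (free_at[w] > last)
            last = free_at[w];
    }
    return last;
}
```

With one worker the total time grows linearly with the number of
commands, which is the single-threaded work queue case; with more
workers the commands overlap. Asynchronous submission aims at the same
overlap without the extra threads.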

There are two options: create lots of kernel threads by calling
kthread_create several times (like IET), or use asynchronous APIs with
the work queue framework. I don't think that having lots of kernel
threads is so bad, though it makes the stgt code a bit complicated and
dirty. So I'll try AIO later with a work queue per target. A work queue
per device is not necessary, though in theory it would fit SAM-3.

The last topic is how to prevent a device from being removed (by a
user) while it has active I/O operations. There are many ways to do
that. I was thinking about something like the following.

stgt_device_destory() sets the DEVICE_DEL bit (device->state). This
prevents queuecommand from putting new commands to the device. Then
stgt_device_destory() calls flush_workqueue() to wait for all commands
to finish, and then frees the resources.
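
The ordering above can be sketched as a single-threaded userspace C
model. This is only an illustration under assumptions, not the stgt
implementation: DEVICE_DEL comes from the text, while struct
model_device, its fields, and the model_* helpers are hypothetical
stand-ins for the device structure, queuecommand, and
flush_workqueue().

```c
#include <assert.h>
#include <stdbool.h>

#define DEVICE_DEL (1u << 0)  /* bit name from the text; value assumed */

/* Hypothetical stand-in for the device structure. */
struct model_device {
    unsigned int state;  /* DEVICE_DEL is set once teardown starts */
    int queued;          /* commands accepted but not yet executed */
    int completed;       /* commands that have finished */
    bool freed;          /* resources released */
};

/* Stand-in for queuecommand: refuse new work once DEVICE_DEL is set. */
static int model_queuecommand(struct model_device *dev)
{
    if (dev->state & DEVICE_DEL)
        return -1;
    dev->queued++;
    return 0;
}

/* Stand-in for flush_workqueue(): run every command already queued. */
static void model_flush(struct model_device *dev)
{
    while (dev->queued > 0) {
        dev->queued--;
        dev->completed++;
    }
}

/* Teardown order: mark as deleted, drain pending work, then free. */
static void model_device_destroy(struct model_device *dev)
{
    dev->state |= DEVICE_DEL;
    model_flush(dev);
    assert(dev->queued == 0);  /* nothing may still be in flight */
    dev->freed = true;
}
```

The model ignores concurrency. In the kernel, the DEVICE_DEL test in
queuecommand and the enqueue would presumably need to be ordered
against the bit being set (for example under a lock), otherwise a
command could slip in between the flush and the free.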
