Sorry for the delay in response.

>> Where do you see the benchmark features that may
>> obscure the real-world performance?
>
> My point is that the performance of real-world workload benchmarks like
> dbench is more relevant for the users than the performance of sequential
> accesses.

Right now we are "fighting" the overhead incurred by tgt itself: command
processing time, user-kernel interactions, memcpys, etc. When we take
measurements against the null device, there is zero penalty on random
accesses, so performance improvements seen with sequential access are
indicative for any type of load.

When we work against a real networked device, then, of course, the
workload type matters a lot. But if we are lagging only a few MB/s behind
local performance with an FC device, I would guess that on any workload
we will closely follow the disk's performance under that same workload.
When the network becomes less of a bottleneck, we should care more about
our own detrimental effects than about trying to improve upon the I/O
device, I think.

But I agree, it is worth taking comparative measurements with a general
workload generator when working against a real I/O device like FC. I'll
try to do that in the near future.

> Adding this feature is fine but I think there are still some issues if
> this is not just for performance measurements. For example, you need
> to take care of WCE. At least, you need to issue SYNCHRONIZE_CACHE
> when necessary.

This bs_sg implementation uses DIO at all times. I guess we don't have to
care about WCE, because by the time we send status to the initiator the
data has not merely been written to a local cache; it has actually been
sent to the I/O device and acked by it. I guess the true meaning of
SYNC_CACHE in such a setting is to forward it to the back-end device,
because its write-back cache, if present, is the one that should be
synchronized. This also means that we have to pass through the other
commands that report the write-back capabilities, like MODE SENSE, as
well. So actually "taking care" of this issue means forwarding more
commands to the device, right? This brings up the following topic...

>> I think there might be also a place for adding some features that you
>> have termed "passthrough", but this is another issue and I'll write
>> a separate mail on it :)
>
> FYI, 'pass through' is a common SCSI term.

I know; what I actually meant is that there is a whole range of possible
implementations, in which a varying number of commands are forwarded to
the device. So it is not only a choice between "100% passthrough" and
"zero passthrough". I think this distinction goes beyond the formal
definition of the term pass through, because, in practical terms, we have
to carefully choose the set of commands which are handled internally vs.
those forwarded to the device. Just as an example, it seems that
RESERVE/RELEASE commands should always be handled internally (at least
because of the possibility that there are multiple paths leading from the
target machine to the I/O device, and those paths may act as different
initiators from the device's perspective).

Right now, architecture-wise, the decision about how a command is handled
is made in the device-type code. The current device types are the
"natural" ones, like SBC, SSC, etc. But what if a specific BS code, used
in conjunction with a device-type code, requires handling of a different
set of commands?
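To make the kind of thing I have in mind a bit more tangible, here is a
rough, purely illustrative sketch in plain C. None of these structures or
helper names exist in tgt today (only the SCSI opcode values are real);
the idea is simply that a BS advertises the opcodes it wants routed to
itself, and the generic dispatch consults that list before falling back
to the device-type table:

/* Purely illustrative sketch -- these types and names do not exist in
 * tgt today. A backing store advertises the opcodes it wants routed to
 * itself (i.e. forwarded to the real device), and the generic dispatch
 * consults that list before falling back to the device-type table. */

#include <stdio.h>
#include <stdint.h>

#define SYNCHRONIZE_CACHE_10	0x35	/* real SCSI opcodes, used as examples */
#define MODE_SENSE_10		0x5a
#define READ_10			0x28

struct bs_template_example {
	const char *name;
	const uint8_t *claimed_ops;		/* opcodes this BS claims, 0xff-terminated */
	int (*bs_cmd_perform)(uint8_t op);	/* handler for the claimed opcodes */
};

static int sg_forward_example(uint8_t op)
{
	printf("opcode 0x%02x forwarded to the SG device by the BS\n", op);
	return 0;
}

static const uint8_t sg_claimed_ops[] = { SYNCHRONIZE_CACHE_10, MODE_SENSE_10, 0xff };

static struct bs_template_example bs_sg_example = {
	.name		= "sg",
	.claimed_ops	= sg_claimed_ops,
	.bs_cmd_perform	= sg_forward_example,
};

/* Generic dispatch: give the BS the first chance, then fall back to the
 * device-type (e.g. SBC) command table. */
static int dispatch_example(struct bs_template_example *bst, uint8_t op)
{
	const uint8_t *p;

	for (p = bst->claimed_ops; p && *p != 0xff; p++)
		if (*p == op)
			return bst->bs_cmd_perform(op);

	printf("opcode 0x%02x handled by the device-type code\n", op);
	return 0;
}

int main(void)
{
	dispatch_example(&bs_sg_example, SYNCHRONIZE_CACHE_10);	/* goes to the BS */
	dispatch_example(&bs_sg_example, READ_10);		/* stays with SBC */
	return 0;
}

In a real implementation the check would of course live inside the
existing per-device-type dispatch path rather than in a separate
function; the only point is that the BS, and not a copied device type,
gets to declare the extra opcodes it needs.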
Today, if I'm satisfied with the SBC device type but want to forward two
additional commands to the BS, I need to define a new device type, copy
the entire SBC code and tweak its dispatch table. That is not an
especially thrifty way of handling it. Note also that my decision to
handle those two commands in the BS does not make the SBC device-type
code "less SBC"; it only makes a small adjustment for the specific
features of my BS implementation.

Perhaps we should allow the BS to report its capabilities or requirements
regarding some (or all) of the commands? Any other ideas about giving the
BS finer control over the command dispatch table?

Alexander Nezhinsky
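P.S. To make the SYNC_CACHE point above a bit more concrete: assuming an
SG-like backend where the device is reachable through the SG_IO ioctl,
forwarding SYNCHRONIZE CACHE(10) is essentially one more pass-through
CDB. The stand-alone sketch below is not taken from bs_sg (the function
name and the fixed timeout are mine); it only shows the general shape of
such a forwarded command:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

/* Forward SYNCHRONIZE CACHE (10) for the whole medium to the SG device. */
static int sync_cache_passthrough(int sg_fd)
{
	unsigned char cdb[10] = { 0x35 };	/* LBA 0, 0 blocks => sync everything */
	unsigned char sense[32];
	struct sg_io_hdr hdr;

	memset(&hdr, 0, sizeof(hdr));
	hdr.interface_id = 'S';
	hdr.cmd_len = sizeof(cdb);
	hdr.cmdp = cdb;
	hdr.dxfer_direction = SG_DXFER_NONE;	/* no data phase */
	hdr.sbp = sense;
	hdr.mx_sb_len = sizeof(sense);
	hdr.timeout = 60 * 1000;		/* ms; arbitrary choice for the example */

	if (ioctl(sg_fd, SG_IO, &hdr) < 0) {
		perror("SG_IO");
		return -1;
	}
	if (hdr.status || hdr.host_status || hdr.driver_status) {
		fprintf(stderr, "SYNCHRONIZE CACHE failed, scsi status 0x%x\n",
			hdr.status);
		return -1;
	}
	return 0;
}

int main(int argc, char **argv)
{
	int fd, ret;

	fd = open(argc > 1 ? argv[1] : "/dev/sg0", O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	ret = sync_cache_passthrough(fd);
	close(fd);
	return ret ? 1 : 0;
}

Forwarding MODE SENSE would look much the same, just with
SG_DXFER_FROM_DEV and a data buffer, so the incremental cost of passing
these commands through seems small.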