[stgt] Startup race condition?
mangoo at wpkg.org
Mon Mar 29 00:41:13 CEST 2010
On 28.03.2010 23:55, Dax Kelson wrote:
> I'm writing documentation and lab exercises on stgt for our Guru Labs
> Linux training courses. We try to document and show best practices in
> our courseware.
> I was looking at the initd.sample file and I see this code:
> # Start tgtd first.
> if [ "$RETVAL" -ne 0 ] ; then
>     echo "Could not start tgtd (is tgtd already running?)"
>     exit 1
> fi
> # Put tgtd into "offline" state until all the targets are configured.
> # We don't want initiators to (re)connect and fail the connection
> # if it's not ready.
> tgtadm --op update --mode sys --name State -v offline
> Is there a race condition between the initial starting of tgtd and
> telling it to go into the offline state?
> If so, how about adding a --start-offline or equivalent to the tgtd
There is no race condition and "technically", tgtd starts in offline mode.
So, why all these commands?
When tgtd is started it has no targets configured. Any initiator
connecting will get a "not ready" response.
However, as soon as the first target is configured, tgtd is ready
to process queries from all initiators connecting to it. This means
any initiator whose target is not configured yet will be told "no such
target here", and bad things will happen (for the initiator).
Consider this situation:
1. You want to update tgtd / kernel / reboot the server
2. tgtd has 20 initiators connected to multiple targets
3. tgtd is restarted and has no targets configured
4. 20 initiators keep reconnecting to tgtd - they are told to come back
soon, as target is not ready
5. The first target is configured
6. One initiator establishes the connection, the other 19 initiators
have their connections failed (no such target here)
And now why the sample script suggests adding offline / ready, as you
4a. tgtd is set to be "offline"
5. We configure all targets; while we do this, initiators are being told
the target is not ready (initiators should retry connecting for some time)
6. Once all targets are configured, tgtd is set to "ready" and initiators
can reconnect; their connections never failed, although the targets were
not present for some time
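
The offline / configure / ready sequence can be sketched roughly like this.
It is only an illustration: the target name, backing device and tid are
made-up examples, and the run() wrapper with DRY_RUN is a helper invented
here so the commands can be printed without a running tgtd. The "ready"
call is assumed to mirror the "offline" call quoted above.

```shell
#!/bin/sh
# Sketch of the start sequence described above. The target name, backing
# device and tid are hypothetical examples, not values from initd.sample.

# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_targets() {
    # 4a. Keep initiators at "not ready" while targets are being added.
    run tgtadm --op update --mode sys --name State -v offline || return 1

    # 5. Configure every target (one hypothetical target shown here).
    run tgtadm --lld iscsi --op new --mode target --tid 1 \
        -T iqn.2010-03.org.example:storage.disk1 || return 1
    run tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 \
        -b /dev/vg0/disk1 || return 1
    run tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL || return 1

    # 6. All targets exist now, so let the initiators back in.
    run tgtadm --op update --mode sys --name State -v ready
}
```

Calling start_targets with DRY_RUN=1 only prints the five tgtadm commands
in order; without it, the commands run against a tgtd that must already
have been started.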
Note that by default, open-iscsi initiators will try to reconnect for
120 seconds before failing the connection, and this is how much time you
have to restart the target machine / tgtd - unless you change the
For other initiator implementations, it may differ.
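
The 120 seconds most likely corresponds to open-iscsi's
node.session.timeo.replacement_timeout setting (an assumption on my part;
the name is not given above). A minimal sketch of how that window might be
raised for one node record follows. The helper name is made up, and it only
prints the iscsiadm command so that it can be reviewed before being run.
The target name, portal and the 300 second value in the usage note are
placeholders.

```shell
# Print the iscsiadm command that would change the reconnect window for
# one node record. This helper is hypothetical; it does not run anything.
replacement_timeout_cmd() {
    target=$1
    portal=$2
    seconds=$3
    echo "iscsiadm -m node -T $target -p $portal -o update" \
         "-n node.session.timeo.replacement_timeout -v $seconds"
}
```

For example, replacement_timeout_cmd iqn.2010-03.org.example:storage.disk1
192.0.2.10:3260 300 prints a command that would give that initiator five
minutes instead of two. The updated value should apply the next time the
session logs in.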