[stgt] Startup race condition?

Tomasz Chmielewski mangoo at wpkg.org
Mon Mar 29 00:41:13 CEST 2010


On 28.03.2010 23:55, Dax Kelson wrote:
> Hi,
>
> I'm writing documentation and lab exercises on stgt for our Guru Labs
> Linux training courses. We try to document and show best practices in
> our courseware.
>
> I was looking at the initd.sample file and I see this code:
>
>          # Start tgtd first.
>          tgtd&>/dev/null
>          RETVAL=$?
>          if [ "$RETVAL" -ne 0 ] ; then
>              echo "Could not start tgtd (is tgtd already running?)"
>              exit 1
>          fi
>          # Put tgtd into "offline" state until all the targets are configured.
>          # We don't want initiators to (re)connect and fail the connection
>          # if it's not ready.
>          tgtadm --op update --mode sys --name State -v offline
>
> Is there a race condition between the initial starting of tgtd and telling
> it to go into the offline state?
>
> If so, how about adding a --start-offline or equivalent to the tgtd
> binary?

There is no race condition, and "technically" tgtd starts in offline mode.

So, why all these commands?


When tgtd is started, it has no targets configured. Any initiator
connecting will get a "not ready" response.
However, as soon as the first target is configured, tgtd is ready
to process queries from all initiators connecting to it. This means that
any initiator whose target is not configured yet will be told "no such
target here", and bad things will happen (for the initiator).


Consider this situation:

1. You want to update tgtd / kernel / reboot the server
2. tgtd has 20 initiators connected to multiple targets
3. tgtd is restarted and has no targets configured
4. The 20 initiators keep reconnecting to tgtd - they are told to come
back soon, as the target is not ready
5. The first target is configured
6. One initiator establishes its connection; the other 19 initiators
have their connections failed (no such target here)


This is why the sample script sets offline / ready, as you noticed:

4a. tgtd is set to "offline"
5. We configure all targets; while we do this, initiators are told
the target is not ready (initiators should keep retrying for some time)
6. Once all targets are configured, tgtd is set to "ready" and initiators
can reconnect; no connections were failed, although the targets were not
present for some time
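
The sequence above could be sketched roughly as follows. This is only
an illustration, not the actual sample script: it assumes targets are
loaded with tgt-admin from /etc/tgt/targets.conf, and the "ready"
command simply mirrors the "offline" one quoted above. The run helper
and the DRY_RUN variable are hypothetical; by default the commands are
only printed, not executed.

```shell
#!/bin/sh
# Sketch: start tgtd, hold it offline while targets are configured,
# then switch it to ready. DRY_RUN=1 (the default here) only prints
# the commands; set DRY_RUN=0 to really execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_targets() {
    # Config path is an assumption; override with TGTD_CONFIG.
    conf="${TGTD_CONFIG:-/etc/tgt/targets.conf}"

    run tgtd || return 1
    # Initiators are told "not ready" instead of "no such target".
    run tgtadm --op update --mode sys --name State -v offline || return 1
    # Configure all targets while still offline.
    run tgt-admin -e -c "$conf" || return 1
    # Only now let initiators log in.
    run tgtadm --op update --mode sys --name State -v ready
}

start_targets
```

If configuring the targets fails, the function returns early and tgtd
stays offline, so initiators keep retrying instead of being refused.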



Note that by default, open-iscsi initiators will try to reconnect for
120 seconds before failing the connection, so this is how much time you
have to restart the target machine / tgtd - unless you change the
default values.

For other initiator implementations, it may differ.
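
For open-iscsi, the 120 seconds correspond, as far as I know, to the
node.session.timeo.replacement_timeout setting. A small sketch for
checking what an initiator is configured with; the function name is
made up for illustration, and /etc/iscsi/iscsid.conf is assumed to be
the config location (it may differ per distribution):

```shell
#!/bin/sh
# Print replacement_timeout from an iscsid.conf-style file.
# Falls back to 120 (the open-iscsi default mentioned above) when the
# file is unreadable or the setting is absent or commented out.
get_replacement_timeout() {
    conf="${1:-/etc/iscsi/iscsid.conf}"
    value=""
    if [ -r "$conf" ]; then
        # Take the last uncommented assignment in the file.
        value=$(sed -n 's/^[[:space:]]*node\.session\.timeo\.replacement_timeout[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1)
    fi
    echo "${value:-120}"
}

echo "initiators retry for $(get_replacement_timeout) seconds"
```

This only reads the default for new sessions; nodes that were already
discovered may carry their own stored value.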



-- 
Tomasz Chmielewski
http://wpkg.org