[Stgt-devel] yet another tgtd iSCSI misbehaviour (aborted journal, remounting ro)

FUJITA Tomonori tomof
Mon Feb 11 14:18:08 CET 2008


From: Tomasz Chmielewski <mangoo at wpkg.org>
Subject: Re: [Stgt-devel] yet another tgtd iSCSI misbehaviour (aborted journal, remounting ro)
Date: Mon, 11 Feb 2008 14:01:42 +0100

> FUJITA Tomonori wrote:
> > On Wed, 06 Feb 2008 10:45:54 +0100
> > Tomasz Chmielewski <mangoo at wpkg.org> wrote:
> > 
> >> It seems there is yet another problem (?) in tgtd.
> >>
> >> It can be easily reproduced when the initiator crashes and then starts 
> >> again. I tested it only with diskless machines booted off iSCSI.
> >>
> >> To reproduce:
> >>
> >> 1. Start tgtd, apply settings with tgtadm
> >> 2. Start a diskless initiator:
> >>   a) a diskless initiator fetches the kernel and the initrd via PXE/tftp
> >>   b) kernel executes initrd; initrd brings the interface up
> >>   c) initrd starts the iSCSI connection with "iscsistart" command from 
> >> open-iscsi
> >>   d) we switch to a new root, system boots fine
> >>   e) IMPORTANT - system starts iscsid now (/etc/init.d/open-iscsi start)
> >>
> >> So far, everything was fine and unproblematic.
> >>
> >> 3. Now, crash your initiator machine (i.e. press reboot button)[1].
> >>
> >> 4. Initiator starts just fine again - the connection was established 
> >> with "iscsistart".
> >>
> >> 5. IMPORTANT - start iscsid now (/etc/init.d/open-iscsi start). The 
> >> initiator will report "connection1:0: iscsi: detected conn error (1011)" 
> >> and will eventually break the connection, remount the fs read-only, etc.; 
> >> scary things will happen.
> >>
> >>   a) there is a workaround to that: when initiator reports 
> >> "connection1:0: iscsi: detected conn error..." - kill tgtd, and start it 
> >> again. Initiator will reconnect flawlessly
> >>   b) if you don't kill/start tgtd again, connection will break and fs 
> >> will be remounted ro.
> >>
> >>
> >> The issue does not happen with IET or SCST.
> >>
> >> It looks like:
> >> - tgtd has an established connection with an initiator
> >> - initiator is killed, but tgtd still thinks initiator is connected
> >> to it
> > 
> > Did you confirm this? 'tgtadm --op show --mode target' shows you the
> > active initiators (and its connections).
> 
> Yes, as I remember, it was showing multiple initiators connected.

If so, the target has two independent sessions (it doesn't matter that
they come from the same IP address). That should be OK.


> - when we start iscsid on the initiator, it confuses tgtd, tgtd
> breaks and has to be restarted

This isn't the case. Are you sure that it was tgtd that started closing
the connection? (I mean, the initiator might have started closing it
instead.)

We need to know why the connection was closed. Can you run
`tcpdump -w dump.cap -s 1600` and send dump.cap?
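
A slightly fuller form of that capture, as a rough sketch only. The
interface name (eth0) and the filter on port 3260 (the default iSCSI
port) are assumptions, not something taken from your setup, so adjust
them as needed. The script just prints the command unless RUN=1 is set,
because the capture itself needs root.

```shell
#!/bin/sh
# Sketch: build a tcpdump command line for capturing iSCSI traffic.
# IFACE and PORT are assumptions (eth0, default iSCSI port 3260).
IFACE="${IFACE:-eth0}"
PORT="${PORT:-3260}"
OUT="${OUT:-dump.cap}"

capture_cmd() {
    # -s 1600 keeps up to 1600 bytes of each packet, as in the command above
    printf 'tcpdump -i %s -s 1600 -w %s port %s\n' "$IFACE" "$OUT" "$PORT"
}

if [ "${RUN:-0}" = "1" ]; then
    # Needs root; stop with Ctrl-C once the conn error has shown up
    eval "$(capture_cmd)"
else
    # Dry run: only show what would be executed
    capture_cmd
fi
```

Start the capture before starting iscsid on the initiator, so that the
dump shows which side closes the connection first.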


