[Stgt-devel] yet another tgtd iSCSI misbehaviour (aborted journal, remounting ro)

FUJITA Tomonori tomof
Mon Feb 11 13:58:19 CET 2008


On Wed, 06 Feb 2008 10:45:54 +0100
Tomasz Chmielewski <mangoo at wpkg.org> wrote:

> It seems there is yet another problem (?) in tgtd.
> 
> It can be easily reproduced when the initiator crashes and then starts 
> again. I tested it only with diskless machines booted off iSCSI.
> 
> To reproduce:
> 
> 1. Start tgtd, apply settings with tgtadm
> 2. Start a diskless initiator:
>   a) a diskless initiator fetches the kernel and the initrd via PXE/tftp
>   b) kernel executes initrd; initrd brings the interface up
>   c) initrd starts the iSCSI connection with "iscsistart" command from 
> open-iscsi
>   d) we switch to a new root, system boots fine
>   e) IMPORTANT - system starts iscsid now (/etc/init.d/open-iscsi start)
> 
> So far, everything was fine and unproblematic.
> 
> 3. Now, crash your initiator machine (e.g. press the reboot button)[1].
> 
> 4. Initiator starts just fine again - the connection was established 
> with "iscsistart".
> 
> 5. IMPORTANT - start iscsid now (/etc/init.d/open-iscsi start). The 
> initiator will report "connection1:0: iscsi: detected conn error (1011)" 
> and will eventually drop the connection and remount the fs read-only; 
> scary things will happen.
> 
>   a) there is a workaround for that: when the initiator reports 
> "connection1:0: iscsi: detected conn error..." - kill tgtd, and start it 
> again. The initiator will reconnect flawlessly.
>   b) if you don't kill/start tgtd again, connection will break and fs 
> will be remounted ro.
> 
> 
> The issue does not happen with IET or SCST.
> 
> It looks like:
> - tgtd has an established connection with an initiator
> - initiator is killed, but tgtd still thinks initiator is connected
> to it

Did you confirm this? 'tgtadm --op show --mode target' shows you the
active initiators (and their connections).
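
As a rough sketch of that check (not something posted in this thread):
the helper below wraps the command quoted above and adds a per-target
connection listing. The '--lld iscsi' argument, the '--mode conn'
listing and target id 1 are assumptions about the tgtadm version and
setup in use; run, show_initiators, DRY_RUN and TID are hypothetical
names used only for illustration.

```shell
#!/bin/sh
# Sketch only: list what tgtd believes is connected.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assumption: the exported target has id 1; override with TID=<n>.
TID="${TID:-1}"

show_initiators() {
    # Targets with their logged-in initiators (the command from the reply above).
    run tgtadm --lld iscsi --op show --mode target
    # Connections of one target; assumes tgtadm supports '--mode conn'.
    run tgtadm --lld iscsi --op show --mode conn --tid "$TID"
}
```

Calling show_initiators on the target host right after step 4, before
iscsid is started, should indicate whether the session from before the
crash is still listed next to the new one.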


> - initiator connects from the same IP address
> - when we start iscsid on the initiator, it confuses tgtd, tgtd breaks 
> and has to be restarted
> 
> 
> Let me know if you need tcpdumps of the following (if so, please give 
> me all the tcpdump command line options you would use):
> 
> - point 2e) - clean start of iscsid on the initiator
> - point 5) - iscsid start on the initiator when connection breaks
> - iscsid start on the initiator, target is SCST
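
One possible set of tcpdump options for the three captures listed above,
as a sketch and not a recommendation made in this thread: capture whole
packets ('-s 0') on one interface, write them to a file ('-w') and
filter on the peer address and the default iSCSI port 3260. The
interface name, the peer address, the file names and the capture_cmd
helper are all placeholders.

```shell
#!/bin/sh
# Sketch only: print a tcpdump command line per capture scenario.
# Assumptions: eth0, 192.0.2.10 and the file names are placeholders;
# 3260 is the default iSCSI port.
IFACE="${IFACE:-eth0}"
PEER="${PEER:-192.0.2.10}"
PORT="${PORT:-3260}"

capture_cmd() {
    # $1 = label used in the output file name
    echo "tcpdump -i $IFACE -s 0 -w iscsi-$1.pcap host $PEER and tcp port $PORT"
}

capture_cmd clean-start   # point 2e)
capture_cmd after-crash   # point 5)
capture_cmd scst          # same test against SCST
```

Each printed command would be started just before iscsid is launched on
the initiator and stopped once the connection error has appeared or the
login has completed.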
> 
> 
> [1] I use kexec here to reboot the machine because it has a buggy BIOS 
> (an old Supermicro P4SBR/P4SBE server). Randomly, it doesn't reboot when 
> a normal reboot command is used; the system shuts down, but never 
> reboots. kexec is a nice workaround for that, but it doesn't close 
> network sockets, so the target thinks we're still connected.
> 
> 
> -- 
> Tomasz Chmielewski
> http://wpkg.org