[Stgt-devel] yet another tgtd iSCSI misbehaviour (aborted journal, remounting ro)
FUJITA Tomonori
tomof
Mon Feb 11 13:58:19 CET 2008
On Wed, 06 Feb 2008 10:45:54 +0100
Tomasz Chmielewski <mangoo at wpkg.org> wrote:
> It seems there is yet another problem (?) in tgtd.
>
> It can be easily reproduced when the initiator crashes and then starts
> again. I tested it only with diskless machines booted off iSCSI.
>
> To reproduce:
>
> 1. Start tgtd, apply settings with tgtadm
> 2. Start a diskless initiator:
> a) a diskless initiator fetches the kernel and the initrd via PXE/tftp
> b) kernel executes initrd; initrd brings the interface up
> c) initrd starts the iSCSI connection with "iscsistart" command from
> open-iscsi
> d) we switch to a new root, system boots fine
> e) IMPORTANT - system starts iscsid now (/etc/init.d/open-iscsi start)
>
> So far, everything was fine and unproblematic.
>
> 3. Now, crash your initiator machine (e.g. press the reboot button)[1].
>
> 4. Initiator starts just fine again - the connection was established
> with "iscsistart".
>
> 5. IMPORTANT - start iscsid now (/etc/init.d/open-iscsi start). The
> initiator will report "connection1:0: iscsi: detected conn error (1011)"
> and will eventually break the connection and remount the fs read-only;
> other scary things will follow.
>
> a) there is a workaround for that: when the initiator reports
> "connection1:0: iscsi: detected conn error..." - kill tgtd and start it
> again. The initiator will reconnect flawlessly.
> b) if you don't kill/start tgtd again, the connection will break and
> the fs will be remounted ro.
>
>
> The issue does not happen with IET or SCST.
>
> It looks like:
> - tgtd has an established connection with an initiator
> - initiator is killed, but tgtd still thinks initiator is connected
> to it
Did you confirm this? 'tgtadm --op show --mode target' shows you the
active initiators (and their connections).
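
To make that check easier to repeat before and after the crash, here is a
rough sketch that pulls the initiator/connection pairs out of the 'show'
output. The SAMPLE text is made up for illustration (target name, initiator
name and address are hypothetical), and the field labels it relies on
("Target", "Initiator:", "Connection:", "IP Address:") are an assumption
about the output layout that may differ between tgt versions.

```python
import re
import subprocess


def parse_connections(text):
    """Collect one dict per 'Connection:' line found in the show output.

    Assumes the layout sketched in SAMPLE below; this is not a
    guaranteed format and may need adjusting for your tgt version.
    """
    conns = []
    target = initiator = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Target \d+: (\S+)", line)
        if m:
            # New target section: forget the previous initiator.
            target, initiator = m.group(1), None
        elif line.startswith("Initiator:"):
            initiator = line.split(":", 1)[1].strip()
        elif line.startswith("Connection:"):
            conns.append({
                "target": target,
                "initiator": initiator,
                "cid": line.split(":", 1)[1].strip(),
                "ip": None,
            })
        elif line.startswith("IP Address:") and conns:
            # Attach the address to the most recent connection.
            conns[-1]["ip"] = line.split(":", 1)[1].strip()
    return conns


def show_targets():
    """Run the command quoted in the mail and return its stdout.

    Needs a running tgtd; some versions may also want '--lld iscsi'.
    """
    result = subprocess.run(
        ["tgtadm", "--op", "show", "--mode", "target"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical output, for illustration only.
SAMPLE = """\
Target 1: iqn.2008-02.org.example:disk1
    System information:
        Driver: iscsi
    I_T nexus information:
        I_T nexus: 2
            Initiator: iqn.1993-08.org.debian:01:abcdef
            Connection: 0
                IP Address: 192.168.0.10
    LUN information:
        LUN: 0
"""

for conn in parse_connections(SAMPLE):
    print(conn)
```

On the target machine, replace SAMPLE with show_targets(). If the entry for
the crashed initiator is still listed after the reboot, tgtd has not noticed
that the old connection is gone.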
> - initiator connects from the same IP address
> - when we start iscsid on the initiator, it confuses tgtd, tgtd breaks
> and has to be restarted
>
>
> Let me know if you need tcpdumps of the following (if so, please give
> me all the tcpdump command line options you would use):
>
> - point 2e) - clean start of iscsid on the initiator
> - point 5) - iscsid start on the initiator when connection breaks
> - iscsid start on the initiator, target is SCST
>
>
> [1] I use kexec here to reboot the machine because it has a buggy BIOS
> (an old Supermicro P4SBR/P4SBE server). Randomly, it doesn't reboot when
> a normal reboot command is used; the system shuts down, but never
> reboots. kexec is a nice workaround for that, but it doesn't close
> network sockets, so the target thinks we're still connected.
>
>
> --
> Tomasz Chmielewski
> http://wpkg.org