[stgt] Help pinpointing cause of tgtd instability

John Pletka jpletka at abraxis.com
Mon Dec 5 21:13:47 CET 2011


Would this be the signature of the task management bug?  The read-only
file system issue happened again today with
scsi-target-utils-1.0.22-1.el6.x86_64, which was built from
http://kojipkgs.fedoraproject.org/packages/scsi-target-utils/1.0.22/1.fc17/src/scsi-target-utils-1.0.22-1.fc17.src.rpm

I'd be glad to test your patch, but I'll need some assistance
applying it since I don't have a working dev environment set up here.

Dec  5 12:51:17 san2 tgtd: abort_task_set(1114) found 14 0
Dec  5 12:51:17 san2 tgtd: abort_cmd(1090) found 14 6
Dec  5 12:51:29 san2 tgtd: abort_task_set(1114) found 40 0
Dec  5 12:51:29 san2 tgtd: abort_cmd(1090) found 40 6
Dec  5 12:51:44 san2 tgtd: conn_close(101) connection closed, 0x171c908 4
Dec  5 12:51:44 san2 tgtd: conn_close(107) sesson 0x1720d40 1
Dec  5 12:52:36 san2 tgtd: tgt_event_modify(225) Cannot find event 17
Dec  5 12:52:36 san2 tgtd: iscsi_event_modify(413) tgt_event_modify failed
Dec  5 12:54:19 san2 tgtd: abort_task_set(1114) found 10000025 0
Dec  5 12:54:19 san2 tgtd: abort_cmd(1090) found 10000025 6
Dec  5 12:57:14 san2 tgtd: abort_task_set(1114) found 1000001a 0
Dec  5 12:57:14 san2 tgtd: abort_cmd(1090) found 1000001a 6
Dec  5 12:57:29 san2 tgtd: conn_close(101) connection closed, 0x172cd38 20
Dec  5 12:57:29 san2 tgtd: conn_close(107) sesson 0x1726e40 1
Dec  5 12:58:41 san2 tgtd: abort_task_set(1114) found 2000005d 0
Dec  5 12:58:41 san2 tgtd: abort_cmd(1090) found 2000005d 6
Dec  5 12:58:57 san2 tgtd: conn_close(101) connection closed, 0x173b838 31
Dec  5 12:58:57 san2 tgtd: conn_close(107) sesson 0x173bb00 1
Dec  5 12:58:59 san2 tgtd: tgt_event_modify(225) Cannot find event 19
Dec  5 12:58:59 san2 tgtd: iscsi_event_modify(413) tgt_event_modify failed
Dec  5 12:58:59 san2 tgtd: tgt_event_modify(225) Cannot find event 20
Dec  5 12:58:59 san2 tgtd: iscsi_event_modify(413) tgt_event_modify failed
Dec  5 13:03:01 san2 tgtd: abort_task_set(1114) found 3000001b 0
Dec  5 13:03:01 san2 tgtd: abort_cmd(1090) found 3000001b 6
Dec  5 13:03:16 san2 tgtd: conn_close(101) connection closed, 0x17ca018 3
Dec  5 13:03:16 san2 tgtd: conn_close(107) sesson 0x17ca2e0 1
Dec  5 13:03:35 san2 tgtd: tgt_event_modify(225) Cannot find event 21
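
A minimal sketch for tallying messages like the ones above, in case it helps
to see how often the aborts are followed by a connection close. The function
name summarize() and the SAMPLE list are made up for illustration, and the
reading of "found <value> <value>" as "first value identifies the aborted
command" is an assumption based only on the log format shown here, not on
the tgtd source.

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   "Dec  5 12:51:17 san2 tgtd: abort_task_set(1114) found 14 0"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+tgtd:\s+"
    r"(?P<func>\w+)\(\d+\)\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count tgtd messages per function and collect abort_task_set values."""
    counts = Counter()
    found_ids = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a tgtd line in the expected format
        counts[m.group("func")] += 1
        if m.group("func") == "abort_task_set":
            # Assumption: the message is "found <id> <other>"; keep <id> as text.
            parts = m.group("msg").split()
            if len(parts) >= 2 and parts[0] == "found":
                found_ids.append(parts[1])
    return counts, found_ids


# A few lines copied from the log above, used only as sample input.
SAMPLE = [
    "Dec  5 12:51:17 san2 tgtd: abort_task_set(1114) found 14 0",
    "Dec  5 12:51:17 san2 tgtd: abort_cmd(1090) found 14 6",
    "Dec  5 12:51:44 san2 tgtd: conn_close(101) connection closed, 0x171c908 4",
    "Dec  5 12:51:44 san2 tgtd: conn_close(107) sesson 0x1720d40 1",
    "Dec  5 12:52:36 san2 tgtd: tgt_event_modify(225) Cannot find event 17",
]

if __name__ == "__main__":
    counts, found_ids = summarize(SAMPLE)
    for func, n in counts.most_common():
        print(f"{func}: {n}")
    print("abort_task_set values:", ", ".join(found_ids))
```

To run it against a real log, pass the lines of the file instead of SAMPLE,
for example summarize(open("/var/log/messages")); that path is only the usual
syslog location on CentOS and may differ on other setups.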



On Sat, Dec 3, 2011 at 11:18 PM, FUJITA Tomonori
<fujita.tomonori at lab.ntt.co.jp> wrote:
> On Thu, 1 Dec 2011 10:32:35 -0500
> John Pletka <jpletka at abraxis.com> wrote:
>
>> I have two NAS devices running an almost identical workload.  One of
>> them has been perfectly stable for over a year now.  On the other,
>> tgtd either aborts, or causes the iscsi mounted file systems to go
>> into read-only mode about once a week.  I wanted to lay out my
>> configuration to see if there is a most-likely cause.  One thing that
>> stands out is the scsi-target-utils version is 1.0.4 on the unstable
>> server, and 1.0.8 on the stable server.  yum update on CentOS 6 says
>> 1.0.4 is the most recent though and I see patches through Jan 17,
>> 2011.  Other potential causes -- bonded ethernet ports on the unstable
>> one, and no swap partition on the unstable one (the OS is installed on
>> a compact-flash card).
>>
>> From the abrt logs:
>> Process /usr/sbin/tgtd was killed by signal 11 (SIGSEGV)
>> Which <might> be related to this bug:
>> https://bugzilla.redhat.com/show_bug.cgi?id=712807
>
> I think that the above bug is not related to your problem. It's more
> likely that the problem is due to task management bugs. We've not
> figured out how the bugs happen.
>
> The following patch disables tmf. Try it to see if it works for you.
>
> diff --git a/usr/iscsi/iscsid.c b/usr/iscsi/iscsid.c
> index 3fbd9f6..6b814ba 100644
> --- a/usr/iscsi/iscsid.c
> +++ b/usr/iscsi/iscsid.c
> @@ -1405,6 +1405,9 @@ static int iscsi_tm_execute(struct iscsi_task *task)
>        struct iscsi_tm *req = (struct iscsi_tm *) &task->req;
>        int fn = 0, err = 0;
>
> +       err = ISCSI_TMF_RSP_REJECTED;
> +
> +#if 0
>        switch (req->flags & ISCSI_FLAG_TM_FUNC_MASK) {
>        case ISCSI_TM_FUNC_ABORT_TASK:
>                fn = ABORT_TASK;
> @@ -1432,7 +1435,7 @@ static int iscsi_tm_execute(struct iscsi_task *task)
>                eprintf("unknown task management function %d\n",
>                        req->flags & ISCSI_FLAG_TM_FUNC_MASK);
>        }
> -
> +#endif
>        if (err)
>                task->result = err;
>        else {


