[stgt] tgtd segfault with software RAID, hard resetting link

FUJITA Tomonori fujita.tomonori at lab.ntt.co.jp
Mon Apr 20 04:33:58 CEST 2009


Tomasz and Chris, very sorry for the delay,

On Wed, 15 Apr 2009 18:58:29 +0100
Chris Webb <chris at arachsys.com> wrote:

> Chris Webb <chris at arachsys.com> writes:
> 
> > Tomasz Chmielewski <mangoo at wpkg.org> writes:
> > 
> > > I tested your patch a bit and with it applied, I could not reproduce the  
> > > segfault any more.
> > >
> > > Which is good.
> > 
> > Great! I should stress though that it's not the correct fix, it's just a
> > and-aid for the problem of scsi_cmd structs being cleared up underneath
> > threads that are blocked on an IO operation, which shouldn't really be
> > happening in the first place.
> 
> PS Another excellent way to trigger it is by doing a very large buffered
> disk write on the target machine whilst tgtd has write-caching off. An
> unbuffered write from tgtd can't complete until the large buffered write has
> been flushed. If the time taken to do this is longer than 30s or so (which
> it easily can be given large RAM), tgtd will crash when the write returns
> after the target has already given up on the connection. I crashed tgtd
> several times this way today.

I've not tried to reproduce this problem yet, but can you please try
this patch?


diff --git a/usr/iscsi/conn.c b/usr/iscsi/conn.c
index 9829ba0..915f76f 100644
--- a/usr/iscsi/conn.c
+++ b/usr/iscsi/conn.c
@@ -166,9 +166,13 @@ void conn_close(struct iscsi_connection *conn)
 	conn->tx_task = NULL;
 
 	/* cleaning up commands waiting for SCSI_DATA_OUT */
-	while (!list_empty(&conn->task_list)) {
-		task = list_entry(conn->task_list.prev, struct iscsi_task,
-				  c_siblings);
+	list_for_each_entry_safe(task, tmp, &conn->task_list, c_siblings) {
+		/*
+		 * This task is in SCSI. We need to wait for I/O
+		 * completion.
+		 */
+		if (task_in_scsi(task))
+			continue;
 		iscsi_free_task(task);
 	}
 done:
diff --git a/usr/iscsi/iscsid.c b/usr/iscsi/iscsid.c
index bd2c982..24b6d23 100644
--- a/usr/iscsi/iscsid.c
+++ b/usr/iscsi/iscsid.c
@@ -1219,6 +1219,7 @@ static int iscsi_target_cmd_queue(struct iscsi_task *task)
 	memcpy(scmd->lun, task->req.lun, sizeof(scmd->lun));
 	scmd->attribute = cmd_attr(task);
 	scmd->tag = req->itt;
+	set_task_in_scsi(task);
 
 	return target_cmd_queue(conn->session->target->tid, scmd);
 }
diff --git a/usr/iscsi/iscsid.h b/usr/iscsi/iscsid.h
index 9f9a6f0..f90e0e9 100644
--- a/usr/iscsi/iscsid.h
+++ b/usr/iscsi/iscsid.h
@@ -244,12 +244,16 @@ struct iscsi_target {
 
 enum task_flags {
 	TASK_pending,
+	TASK_in_scsi,
 };
 
 #define set_task_pending(t)	((t)->flags |= (1 << TASK_pending))
 #define clear_task_pending(t)	((t)->flags &= ~(1 << TASK_pending))
 #define task_pending(t)		((t)->flags & (1 << TASK_pending))
 
+#define set_task_in_scsi(t)     ((t)->flags |= (1 << TASK_in_scsi))
+#define task_in_scsi(t)		((t)->flags & (1 << TASK_in_scsi))
+
 extern int lld_index;
 extern struct list_head iscsi_targets_list;
 
--
To unsubscribe from this list: send the line "unsubscribe stgt" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



More information about the stgt mailing list