[stgt] tgtd list corruption bug

FUJITA Tomonori fujita.tomonori at lab.ntt.co.jp
Sun Jun 3 01:40:20 CEST 2012


On Wed, 23 May 2012 21:19:57 -0700
Andy Grover <agrover at redhat.com> wrote:

> Hello Tomo-san and all,
> 
> I've received another report from a customer of a list corruption bug
> causing tgtd to segfault, this time with a core file. Here's a
> screenshot of the debugging session:
> 
> http://fedorapeople.org/~grover/tgtd-ddd-screenshot.png
> 
> Some observations:
> 
> 1) The pointer to conn->session->cmd_list is in %rdx. Both the prev
> pointer and the next pointer (eventually) point to zeroed-out list
> entries, which leads to the segfault.
> 
> 2) The system's log showed abort_task_set()s and conn_close()s, but none
> within hours of the segfault.
> 
> 3) Isn't this a *different* list than the one that showed the corruption
> last time? iscsi_session.cmd_list instead of it_nexus->cmd_list? Strange.
> 
> Anyway I went over the code again and still don't see a bug, but maybe
> someone else might see something, or this list corruption might be
> easier to root-cause than the other one.

Unfortunately, neither do I. I guess that we need to rewrite the task
management code.
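
Until the root cause is known, a sanity check on the list links might
help catch the corruption closer to where it happens, rather than at the
later segfault. What follows is only a minimal sketch, assuming a
kernel-style circular doubly linked list. The helper name list_check,
its max_entries bound, and the struct layout are illustrative
assumptions and are not taken from the tgt source:

```c
#include <stddef.h>
#include <stdio.h>

/* Kernel-style circular doubly linked list node (illustrative layout). */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Hypothetical helper: walk the list from head and verify that every
 * node has non-NULL links and that next->prev points back at the node.
 * Returns 0 if the list is consistent, -1 on the first violation.
 * max_entries bounds the walk so a corrupted cycle cannot loop forever.
 */
static int list_check(const struct list_head *head, size_t max_entries)
{
	const struct list_head *pos = head;
	size_t n = 0;

	do {
		const struct list_head *next = pos->next;

		if (!next || !pos->prev) {
			fprintf(stderr, "list_check: NULL link at %p\n",
				(const void *)pos);
			return -1;
		}
		if (next->prev != pos) {
			fprintf(stderr, "list_check: bad back link after %p\n",
				(const void *)pos);
			return -1;
		}
		pos = next;
	} while (pos != head && ++n <= max_entries);

	/* If we did not return to head, the walk exceeded max_entries. */
	return pos == head ? 0 : -1;
}
```

Calling such a check before and after each place that adds to or removes
from a cmd_list would flag a zeroed-out entry (its prev pointer is NULL,
so the back-link test fails) without dereferencing its NULL pointers.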