[stgt] tgtd list corruption bug

Andy Grover agrover at redhat.com
Thu May 24 06:19:57 CEST 2012

Hello Tomo-san and all,

I've received another report of a list corruption bug causing tgtd to
segfault from a customer, and this time with a core file. Here's a
screenshot of the debugging session:


Some observations:

1) The pointer to conn->session->cmd_list is in %rdx. Both the prev pointer
and the next pointer (eventually) point to zeroed-out list entries, and leads to

2) The system's log showed abort_task_set()s and conn_close()s, but none
within hours of the segfault.

3) Isn't this a *different* list than the one that showed corruption last
time? iscsi_session.cmd_list instead of it_nexus->cmd_list? Strange.

Anyway, I went over the code again and still don't see a bug, but maybe
someone else might see something, or this list corruption might be
easier to root-cause than the other one.
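
In case it helps anyone catch this closer to the point of corruption, here
is a minimal sketch of a list sanity check. It assumes a Linux-style
circular doubly linked list, which is my understanding of what tgtd's list
helpers provide. The names list_node, list_node_init, list_node_add_tail
and list_is_sane are hypothetical stand-ins, not taken from the tgt source.
It flags exactly the pattern in observation 1: an entry whose next/prev
have been zeroed, or whose back-link no longer matches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Linux-style list head; not the tgt definition. */
struct list_node {
    struct list_node *next, *prev;
};

/* An empty circular list points at itself in both directions. */
static void list_node_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Link n in just before head, i.e. at the tail of the list. */
static void list_node_add_tail(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/*
 * Walk forward from head and check every link.
 * Returns 1 if the walk gets back to head with all back-links intact,
 * 0 if it hits a NULL (zeroed) pointer, a mismatched back-link, or does
 * not return to head within max_nodes entries.
 */
static int list_is_sane(const struct list_node *head, size_t max_nodes)
{
    const struct list_node *cur = head;
    size_t i;

    assert(head != NULL);

    for (i = 0; i <= max_nodes; i++) {
        const struct list_node *next = cur->next;

        if (next == NULL || cur->prev == NULL)
            return 0;               /* zeroed-out entry */
        if (next->prev != cur)
            return 0;               /* back-link does not match */
        if (next == head)
            return 1;               /* completed the circle */
        cur = next;
    }
    return 0;                       /* too long, or a loop that skips head */
}
```

Calling something like this on the session's command list before and after
each add/delete (in a debug build only) might narrow down which path leaves
the zeroed entries behind. The max_nodes bound is only there so that a
corrupted list cannot make the check itself spin forever.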

Regards -- Andy
