[stgt] tgt V0.9.7 & V0.9.8 - getting tgtd segfault error 4
Martin Montreuil
martin.montreuil at oracle.com
Thu Oct 1 03:00:49 CEST 2009
> Let me update you that we are past the issue in a number of ways.
>
> First, we upgraded to 0.9.9 Sunday night, found that we could run 36+
> hours, but then ran out of swap (32MB main memory + 8GB swap was
> configured). The swap is now sized at 46GB. You may, under extreme
> and prolonged load, have an issue but we are unlikely to be able to
> recreate this scenario.
>
> The root cause of our issue was a very large amount of I/O being
> driven through the system due to an apparent malfunction in our own
> software higher up the stack. We are working around that issue and
> consider tgt to be working fine with this most recent release.
>
> Bottom line - the segfault was resolved by the 0.9.9 release.
>
> Thanks!
>
> Marty
>
> FUJITA Tomonori wrote:
>> Very sorry for the late response,
>>
>> On Fri, 25 Sep 2009 07:17:31 -0700 (PDT)
>> Martin Montreuil <martin.montreuil at oracle.com> wrote:
>>
>>
>>> Under V0.9.7 received:
>>> Sep 24 05:12:57 storageserver kernel: tgtd[31665]: segfault at 0000555e4ee57d90 rip 0000003dc14715a8 rsp 00007fff7f899ce0 error 4
>>>
>>> Upgraded to V0.9.8 and:
>>> Sep 25 01:37:02 storageserver kernel: tgtd[31609]: segfault at fffffffffffffff0 rip 0000000000405ae4 rsp 00007fffc7fe3940 error 4
>>>
>>> Not repeatable on demand, but it generally happens within 24 hours. There are 143 disks being served in this configuration and multipathd is in use. This is a new installation.
>>>
>>> targets.conf looks like:
>>> <target iqn.2009-06.crrel:storageserver.disks>
>>> backing-store /dev/mapper/mpath1
>>> backing-store /dev/mapper/mpath2
>>> ...
>>> backing-store /dev/mapper/mpath143
>>> allow-in-use yes
>>> </target>
>>>
>>> Under V0.9.8 we are also seeing something new - "BUG: soft lockup" messages (below) just prior to the segfault.
>>>
>>> Any suggestions?
>>>
>>
>> Firstly, can you try the latest git tree?
>>
>> You hit a kernel bug (soft lockup) so I guess that an initiator drops
>> a connection then tgtd crashes.
>>
>> It seems that there might be a bug in the cleanup of unfinished
>> tasks. I'll dig into the code later. I have several things to do
>> before the kernel summit so it might take some time. Sorry about that.
>> --
>> To unsubscribe from this list: send the line "unsubscribe stgt" in
>> the body of a message to majordomo at vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
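
A note for anyone reading the two kernel lines quoted above: the trailing
"error 4" is the x86 page-fault error code, which the kernel prints in hex.
Bit 0 distinguishes a protection violation from a not-present page, bit 1
a write from a read, and bit 2 user mode from kernel mode, so 4 means a
user-mode read of an unmapped address. The sketch below decodes such a line.
It is only an illustration, not part of tgt; the names SEGFAULT_RE,
decode_error and parse_segfault are made up here, and the pattern assumes
the "segfault at ... rip ... rsp ... error N" layout seen in this thread.

```python
import re

# Layout as seen in the quoted log lines (older x86-64 kernels print
# "rip"/"rsp"; the pattern also accepts the newer "ip"/"sp" spelling).
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"r?ip (?P<ip>[0-9a-f]+) r?sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


def parse_segfault(line):
    """Return a dict describing a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    info = decode_error(int(m.group("err"), 16))  # error code is hex
    addr = int(m.group("addr"), 16)
    info["comm"] = m.group("comm")
    info["pid"] = int(m.group("pid"))
    info["addr"] = addr
    # Reinterpret the 64-bit address as signed, so that values just below
    # 2**64 show up as small negative offsets.
    info["addr_signed"] = addr - (1 << 64) if addr >= (1 << 63) else addr
    return info


if __name__ == "__main__":
    line = ("Sep 25 01:37:02 storageserver kernel: tgtd[31609]: segfault at "
            "fffffffffffffff0 rip 0000000000405ae4 rsp 00007fffc7fe3940 error 4")
    print(parse_segfault(line))
```

For the V0.9.8 line the signed address comes out as -16, which is what a
small negative offset from a NULL pointer would produce. That is consistent
with, but does not prove, the suspected bug in cleaning up unfinished tasks.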