[stgt] tgt V0.9.7 & V0.9.8 - getting tgtd segfault error 4

Martin Montreuil martin.montreuil at oracle.com
Thu Oct 1 03:00:49 CEST 2009


> A quick update: we are past the issue, for more than one reason.
>
> First, we upgraded to 0.9.9 Sunday night and found that we could run
> for 36+ hours, but then ran out of swap (32MB main memory + 8GB swap
> was configured).  The swap is now sized at 46GB.  tgt may still have
> an issue under extreme and prolonged load, but we are unlikely to be
> able to recreate this scenario.
>
> The root cause of our issue was a very large amount of I/O being 
> driven through the system due to an apparent malfunction in our own 
> software higher up the stack.   We are working around that issue and 
> consider tgt to be working fine with this most recent release.
>
> Bottom line - the segfault was resolved by the 0.9.9 release.
>
> Thanks!
>
> Marty
>
> FUJITA Tomonori wrote:
>> Very sorry for the late response,
>>
>> On Fri, 25 Sep 2009 07:17:31 -0700 (PDT)
>> Martin Montreuil <martin.montreuil at oracle.com> wrote:
>>
>>   
>>> Under V0.9.7 received:
>>> Sep 24 05:12:57 storageserver kernel: tgtd[31665]: segfault at 0000555e4ee57d90 rip 0000003dc14715a8 rsp 00007fff7f899ce0 error 4
>>>
>>> Upgraded to V0.9.8 and:
>>> Sep 25 01:37:02 storageserver kernel: tgtd[31609]: segfault at fffffffffffffff0 rip 0000000000405ae4 rsp 00007fffc7fe3940 error 4
>>>
>>> Not repeatable on demand, but it generally happens within 24 hours.  There are 143 disks being served in this configuration and multipathd is in use.  This is a new installation.
>>>
>>> targets.conf looks like:
>>> <target iqn.2009-06.crrel:storageserver.disks>
>>> backing-store /dev/mapper/mpath1
>>> backing-store /dev/mapper/mpath2
>>> ...
>>> backing-store /dev/mapper/mpath143
>>> allow-in-use yes
>>> </target>
>>>
>>> Under V0.9.8 we are also seeing something new: "BUG: soft lockup" messages (below) just prior to the segfault.
>>>
>>> Any suggestions? 
>>>     
>>
>> Firstly, can you try the latest git tree?
>>
>> You hit a kernel bug (soft lockup), so my guess is that an initiator
>> drops a connection and then tgtd crashes.
>>
>> It seems that there might be a bug in cleaning up unfinished
>> tasks. I'll dig into the code later. I have several things to do
>> before the kernel summit, so it might take some time. Sorry about that.
>> --
>> To unsubscribe from this list: send the line "unsubscribe stgt" in
>> the body of a message to majordomo at vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>   
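
For anyone reading the two kernel lines quoted above, here is a minimal sketch (not from the thread) that pulls the fields out of a "segfault at ... rip ... rsp ... error N" line and decodes the low three bits of the x86 page-fault error code: bit 0 is protection violation vs. page not present, bit 1 is write vs. read, bit 2 is user vs. kernel mode. The names parse_segfault and decode_error and the dictionary keys are made up for illustration, and the regular expression assumes exactly the message layout shown in the log lines in this thread.

```python
import re

# Assumed layout, taken from the two log lines quoted in this thread:
#   tgtd[PID]: segfault at ADDR rip RIP rsp RSP error CODE
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


def parse_segfault(line):
    """Return the parsed fields of a segfault log line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    info = {"comm": m.group("comm"), "pid": int(m.group("pid"))}
    # Addresses and the error code are printed in hexadecimal.
    for key in ("addr", "rip", "rsp"):
        info[key] = int(m.group(key), 16)
    info.update(decode_error(int(m.group("err"), 16)))
    return info


if __name__ == "__main__":
    line = ("Sep 25 01:37:02 storageserver kernel: tgtd[31609]: segfault at "
            "fffffffffffffff0 rip 0000000000405ae4 rsp 00007fffc7fe3940 error 4")
    print(parse_segfault(line))
```

For both log lines in this thread, error 4 decodes to a user-mode read of a page that was not present, i.e. tgtd read from an unmapped address. That is consistent with a bad pointer but does not by itself identify the bug.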
