[stgt] Max Sessions per Target
Bruno Condez
bcondez at riotgames.com
Thu Jan 24 04:42:56 CET 2013
Hi Ronnie,
The original iPXE session does stay ESTABLISHED even after the OS kernel
takes over and after OS shutdown. Netstat confirms it.
When I shut down the OS, the OS kernel connections go into FIN_WAIT (as
expected) and clear shortly after. But the original iPXE ones just hang
around in ESTABLISHED like leeches.
Tgt shows the iPXE session as active but with 0 connections, while the OS
kernel session shows 1 connection.
So, tgt does realize some sessions have 0 connections (not sure if it's
tgt who detects this or iPXE who reports closing the connection (but not
the session)).
Though, would it maybe be possible to reap sessions that have 0 active
connections? Or maybe there are specific reasons why you don't do it?
TCP keepalives are useful for cleanup, but I'm not sure they will work
for my setup.
Here are my thoughts: during the boot phase, iPXE creates a session, then a
connection and then presents the LUN to the computer, which will boot from
it.
A few seconds later, OS kernel takes over and establishes a new session.
At this point in time there are 2 sessions in the same target (OS with one
active connection and iPXE with 0 active connections). So I would still
need to use your patch of allowing only 2 sessions.
Now we assume TCP Keepalive is configured and eventually kicks in to
successfully clean the stale iPXE session. At this point the OS is booted,
running and only one iscsi session exists on the target (TCP Keepalive
just cleared out the stale iPXE).
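How long "eventually" takes depends on the three keepalive values. As a rough sketch (Linux socket options; the helper names keepalive_detect_seconds and set_keepalive are made up for illustration and are not tgtd functions), the values Ronnie suggested can be worked out like this, counting from the last traffic seen on the connection:

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Approximate seconds from the last traffic on a connection until the
 * kernel gives up on a silent peer: the idle time before the first probe,
 * plus one interval per unanswered probe. */
int keepalive_detect_seconds(int idle, int cnt, int intvl)
{
        assert(idle >= 0 && cnt >= 0 && intvl >= 0);
        return idle + cnt * intvl;
}

/* Enable keepalives on fd with the given values (Linux option names).
 * Returns 0 on success, -1 if any setsockopt() call fails. */
int set_keepalive(int fd, int idle, int cnt, int intvl)
{
        int on = 1;

        if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) ||
            setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) ||
            setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) ||
            setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)))
                return -1;
        return 0;
}
```

With idle=60, cnt=3 and intvl=30 this comes to roughly 150 seconds for a peer that has gone silent, of which 90 seconds are the probing phase.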
Because I had to allow 2 sessions on the same target, nothing prevents
another user from booting the same target. iPXE for that user would create
a session (which would succeed because of the limit of 2) and the OS would
start to boot. Granted, the OS kernel iSCSI session will hang (because it
would be trying to establish a 3rd session), but by this point the OS
*might* have already written data to the LUN through the iPXE session.
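The sequence above can be modelled in a few lines of C. This is only a toy model under stated assumptions: model_target, model_session_create and model_session_reap are hypothetical names, and tgtd's real it_nexus handling is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a target's session list. */
#define MAX_SESSIONS 2

struct model_target {
        int sessions;   /* sessions currently known to the target */
};

/* Admit a new session unless the target already holds MAX_SESSIONS. */
int model_session_create(struct model_target *t)
{
        assert(t != NULL);
        if (t->sessions >= MAX_SESSIONS)
                return -EEXIST;
        t->sessions++;
        return 0;
}

/* Drop one session, e.g. when a keepalive timeout reaps a stale one. */
void model_session_reap(struct model_target *t)
{
        assert(t != NULL);
        if (t->sessions > 0)
                t->sessions--;
}
```

In this model, user A's iPXE and OS sessions fill both slots. Once the stale iPXE session is reaped, a slot is free again, so user B's iPXE session is admitted and only user B's later OS session is refused.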
I also noticed during my testing that booting the same LUN from different
computers crashes tgtd (segfault); not always immediately, but never more
than a couple of minutes later.
I'm not saying this is specifically an issue; just mentioning it as an FYI
(though you guys are probably aware of it).
If tgtd crashes, all hell breaks loose. So I ended up leveraging iPXE's
ability to chainload the iSCSI boot from an HTTP CGI script, to
control/prevent booting the same target twice.
Basically, iPXE hits an HTTP CGI script and sends its iSCSI boot
information. The CGI script then checks via tgtadm whether there are
already established connections to that target (by using --mode conn and
grepping for "Connections: 1"). If there are active connections, booting
from that target is refused.
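The check described above boils down to scanning text for a non-zero connection count. A minimal sketch in C, assuming the tgtadm output has already been captured into a string and that each session reports a "Connections: N" field; has_active_connection is a hypothetical helper, and any sample strings fed to it are invented rather than real tgtadm output:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if any "Connections: N" field in text has N > 0, else 0. */
int has_active_connection(const char *text)
{
        static const char key[] = "Connections:";
        const char *p = text;

        assert(text != NULL);
        while ((p = strstr(p, key)) != NULL) {
                p += sizeof(key) - 1;           /* skip past the key */
                if (strtol(p, NULL, 10) > 0)    /* strtol skips leading spaces */
                        return 1;
        }
        return 0;
}
```

A check like this is not atomic with the login that follows it, so two clients hitting the script at the same moment could both pass.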
Basically, I just added an upper layer on top of tgt to control/prevent
booting the same target. Since I went this way, I ended up pimping the CGI
script to auto-provision targets and LUNs via tgtadm on the fly (and the
disk volumes in LVM) so I don't have to do that manually.
Having this sanity check (disallow booting the same LUN) in tgt as an
option would still be preferred, as tgt is the ultimate source for this;
tgt tracks all this info better than any script.
Personally, I do see value in the above as a last line of defense against
any misconfigurations (I can see sysadmins configuring new targets to boot
new servers then mistakenly fat-fingering something and suddenly a new
server boots a live LUN for another server).
For now, my CGI wrapper will do the trick.
I will still be available for testing if you do end up implementing "1"
(NOPs). Or if you need me to collect information from my setup or
replicate the issue.
Lastly, I do appreciate the help, insight and suggestions you provided,
Ronnie. They were most useful.
Cheers,
Bruno
On 1/23/13 7:47 AM, "ronnie sahlberg" <ronniesahlberg at gmail.com> wrote:
>Hi,
>
>So the patch did compile and work, that is one step towards solving
>your use case at least.
>
>
>So the original session from iPXE remains dormant in TGTD even after
>you reboot the client?
>That sounds like iPXE just abandons the iSCSI session once the kernel
>takes over using its own kernel iSCSI session, without logging out or
>tearing down the TCP connection.
>
>Could you check on the target with
>netstat -tapn | grep 3260
>and see if you can see the iPXE connection remaining in ESTABLISHED
>state after the kernel has taken over?
>I guess you should have only one session in ESTABLISHED while iPXE
>boots, then two sessions once the kernel has taken over.
>Then shutdown the client and see if the iPXE tcp connection remains.
>
>
>You could also run wireshark on the target and see what the client is
>doing and what happens on the two sessions.
>
>
>If this is the case that the iPXE session remains, and even remains
>after you have rebooted the client, there are a few things that can be
>done.
>
>
>1, the proper way to detect and reap abandoned TCP connections in
>iscsi would be to send target initiated NOPs.
>Unfortunately TGTD only supports this for iSER but not for iSCSI.
>I might be able to try to add this over the upcoming weekend, maybe.
>But this is at best only a long term solution.
>
>For a short term solution I think you might have to try :
>2, TCP keepalives, TGTD supports TCP keepalives but all the values
>are hardcoded and pretty big.
>Try changing the values in usr/iscsi/iscsi_tcp.c to something more
>aggressive :
>
>        opt = 60;   /* idle seconds before the first probe */
>        ret = setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &opt, sizeof(opt));
>        if (ret)
>                return ret;
>
>        opt = 3;    /* unanswered probes before the connection is dropped */
>        ret = setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &opt, sizeof(opt));
>        if (ret)
>                return ret;
>
>        opt = 30;   /* seconds between probes */
>        ret = setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &opt, sizeof(opt));
>        if (ret)
>                return ret;
>
>
>The values above should mean that an abandoned/dead session will be
>automatically torn down after 1.5 - 2 minutes (3*30 to (3+1)*30 seconds)
>after the client is rebooted.
>
>TGTD should really add command line arguments to control/set the tcp
>keepalive values at some stage, but for now you have to tweak it in
>the sourcecode.
>
>
>See if you can get the TCP keepalives working. If you use wireshark,
>it should be able to detect what are TCP keepalives and flag them in
>the information column so they are easy to spot.
>(a tcp keepalive is an unsolicited TCP ack segment with zero or one
>byte of data but which reverses the sequence number so that it is one
>less than the lowest valid value for the left edge of the tcp window,
>since it thus contains an invalid sequence number it triggers the
>other end to immediately send a tcp ack back with the correct value.
>since these invalid segments will result in the other end immediately
>responding back with an ACK this is used by TCP to detect when the
>other end has disappeared.)
>
>
>regards
>ronnie sahlberg
>
>On Tue, Jan 22, 2013 at 2:50 PM, Bruno Condez <bcondez at riotgames.com>
>wrote:
>> Hi Ronnie,
>>
>> We'll be using this setup for gaming stations, where players will sit
>> down at whatever PC they choose, boot their OS image (pre-built by us)
>> and play.
>> We do not know which computer each player is going to use. They might
>> also use different computers throughout the day.
>> In the boot menu, players select their name and wait for their OS image
>> to boot.
>> In the background the menu code sets an IQN (based on the player name)
>> and boots from a pre-defined LUN.
>>
>> Because of this mobility, ACLs don't really work for us. Players are
>> supposed to be able to use whatever PC they want.
>> We could use CHAP auth, but we want to minimize any player input during
>> the boot phase, and it also does not really prevent someone from booting
>> someone else's LUN (if they know the password).
>>
>> Your patch does compile and work as intended (many thanks for that!),
>> though I did find that this solution ends up not working as I intended.
>> The main reason is that during initial bootstrap, iPXE creates a
>> session to present the LUN to the local hardware and boot from it. The
>> OS will then take over iSCSI handling and a new session is created (it
>> does not reuse the session from iPXE). However, the original session
>> from iPXE never gets removed from tgt (even though it reports 0
>> connections). It should be noted that the client IQN and IP are exactly
>> the same.
>> When shutting down the OS, its session gets cleaned from tgt but the
>> original iPXE session still does not.
>>
>> Because of this, if we try to boot that same target again, it already
>> has 1 stale session and fails to boot because it needs to create 2.
>>
>> So, tgt is aware the initial session from iPXE no longer has any
>> connections but does not close its session.
>> Any idea why this would happen? My tgt setup uses the stock settings.
>>
>> Cheers,
>> Bruno
>>
>> On 1/21/13 9:30 PM, "ronnie sahlberg" <ronniesahlberg at gmail.com> wrote:
>>
>>>Hi,
>>>
>>>Why exactly can you not use ACLs? Maybe it is possible to tweak them
>>>so that they will work for your use-case ?
>>>
>>>
>>>
>>>Not tested at all, not even compile tested, but this might do what you
>>>want :
>>>(I doubt this kind of feature will go into mainline)
>>>
>>>{
>>>        int cnt = 0;
>>>        struct list_head *tmp;
>>>
>>>        list_for_each(tmp, &target->it_nexus_list)
>>>                cnt++;
>>>        if (cnt > 1)
>>>                return -EEXIST;
>>>}
>>>
>>>
>>>
>>>
>>>On Mon, Jan 21, 2013 at 8:12 PM, Bruno Condez <bcondez at riotgames.com>
>>>wrote:
>>>> Hi Everyone,
>>>>
>>>> I'm looking for what I hope to be some quick help from you stgt
>>>> developers.
>>>>
>>>> I run stgt 1.0.33 on CentOS 6.2 and I'm looking for a way to limit
>>>> the number of sessions a target can have.
>>>>
>>>> A bit of background:
>>>> I have a bunch of client computers booting their OS through software
>>>> iSCSI by leveraging iPXE.
>>>> I PXE-boot iPXE, which then presents a LUN to the computer as a local
>>>> disk and instructs the computer to boot from it.
>>>> These computers are used by different users who need their own OS
>>>> customized a specific way and bootable from different computers (same
>>>> hardware, though).
>>>>
>>>> Each user has their own target.
>>>> A boot menu exists that allows a user to choose his own target (OS
>>>> image) from a list.
>>>> Now, the reason why I need to limit the number of sessions is to
>>>> prevent 2 users from booting the same LUN. Users can make mistakes and
>>>> accidentally (or on purpose) boot someone else's LUN.
>>>>
>>>> There are reasons why setting up CHAP authentication or initiator ACLs
>>>> will not work for this specific setup.
>>>>
>>>> Hence, limiting the sessions per target is the desired effect that
>>>> works for this setup.
>>>>
>>>> I have found a patch from 2008 that makes every target allow only a
>>>> single session.
>>>> In particular, the code below does the trick:
>>>>
>>>> diff --git a/usr/target.c b/usr/target.c
>>>> index dc30c87..91085dc 100644
>>>> --- a/usr/target.c
>>>> +++ b/usr/target.c
>>>> @@ -248,6 +248,9 @@ int it_nexus_create(int tid, uint64_t itn_id, int host_no, char *info)
>>>>
>>>>          target = target_lookup(tid);
>>>>
>>>> +        if (!list_empty(&target->it_nexus_list))
>>>> +                return -EEXIST;
>>>> +
>>>>          itn = zalloc(sizeof(*itn));
>>>>          if (!itn)
>>>>                  return -ENOMEM;
>>>>
>>>>
>>>>
>>>> I've applied that patch to the current code (git'ed today) and it does
>>>> work as intended.
>>>>
>>>> However, I actually need it to allow a max of 2 sessions per target.
>>>> This is because, during initial boot, iPXE creates a session and
>>>> presents the LUN to the computer, which then boots from that LUN;
>>>> during boot the OS detects it's on iSCSI and takes over iSCSI handling
>>>> from iPXE by establishing a new session to the same target.
>>>> When this happens, in the tgtadm output I see two sessions from the
>>>> same IP, though the original session shows 0 connections and the
>>>> second session 1 connection (the actual OS). But for tgt, there are
>>>> still two sessions.
>>>>
>>>> So, would it be possible to have a patch similar to the above but that
>>>> allows a max of 2 sessions? Or a user-configurable value?
>>>>
>>>> I realize this is a very specific request. I'm OK with this patch
>>>> being ad hoc, not officially supported, and me having to deal with it
>>>> on my own.
>>>> I would write such a patch myself but my knowledge of C is zero.
>>>>
>>>> I do appreciate any help in getting this specific request going.
>>>>
>>>> Cheers,
>>>> Bruno
>>>>
>>>>
>>>> ________________________________
>>>>
>>>> Riot Games Ltd, Registered in Ireland No 483483. Registered Office 1st
>>>>Floor, Beaux Lane House, Lower Mercer Street, Dublin 2
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe stgt" in
>>>> the body of a message to majordomo at vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html