[Stgt-devel] close completes before synchronize cache response

Pete Wyckoff pw
Fri Aug 31 20:53:45 CEST 2007


michaelc at cs.wisc.edu wrote on Fri, 31 Aug 2007 12:49 -0500:
> Pete Wyckoff wrote:
> >Commit ae2f80cbc432fe8cc4da94bdf289a7d856f11ac2:
> >
> >    Date:   Sun Aug 12 08:18:45 2007 +0900
> >
> >    close a connection after sending a logout response
> >    
> >    For now we need to just close a connection after sending a logout
> >    response.
> >    
> >causes some initiator complaints with 2.6.22 on logout:
> >
> >sd 15:0:0:1: [sdb] Synchronizing SCSI cache
> >iscsi: cmd 0x35 is not queued (6)
> >iscsi: cmd 0x35 is not queued (6)
> >iscsi: cmd 0x35 is not queued (6)
> >sd 15:0:0:1: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >
> 
> I think this case is an initiator issue. As you said, though, there could 
> be an issue with the target too, but those errors above are from an 
> initiator goof-up.
> 
> Before sending the logout we should have been waiting for the sync cache 
> commands to complete. The scsi-ml layer changed behavior in 2.6.21: in 
> previous kernels, removing the device from userspace would wait for the 
> sync cache to complete. The initiator would then set some internal bits 
> so new commands were not queued, and then it would send the logout 
> command.
> 
> In 2.6.21, if you remove the device from userspace the sysfs operation 
> returns right away, so the cache sync could be queued in the block/scsi 
> layer, in the driver or it could be on the wire or it could be done. We 
> do not know. So for 2.6.21+ the initiator was doing
> 
> echo 1 > /sys/block/sdX/device/delete
> 
> Then setting the internal bits to stop new I/O from being queued to the 
> driver. At this point the sync caches were finally getting queued to 
> the driver, and the driver would fail them with the value you see above 
> because it thought they had already been sent and that we wanted to 
> shut down.
> 
> This is fixed in the iscsi git trees.

Ah, it is all clear now that you point it out.  Of course the
initiator should be waiting for the cache flush to complete on the
target.  I just assumed it was something in the target (or, more
likely, my patches on top of it).  Thanks for the explanation.
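
To make the ordering concrete, here is a toy Python model of the race
as described above. It is only a sketch under assumptions: the class,
method, and field names are hypothetical and do not correspond to the
open-iscsi or kernel code, and it does not claim to show how the fix in
the iscsi git trees is implemented. It only contrasts "stop, then let
the sync cache arrive" with "wait for the sync cache, then stop".

```python
from collections import deque

SYNCHRONIZE_CACHE = 0x35  # SCSI opcode seen in the log above


class ToySession:
    """Toy model of an initiator session; all names are hypothetical."""

    def __init__(self):
        self.pending = deque()  # commands still above the driver
        self.stopped = False    # the "internal bits" rejecting new commands
        self.completed = []     # opcodes that reached the target
        self.failed = []        # opcodes rejected as "not queued"
        self.logged_out = False

    def delete_device(self):
        # Models `echo 1 > /sys/block/sdX/device/delete` on 2.6.21+:
        # it returns right away and leaves the cache sync still queued.
        self.pending.append(SYNCHRONIZE_CACHE)

    def queue_to_driver(self):
        # Hand every pending command to the driver.
        while self.pending:
            opcode = self.pending.popleft()
            if self.stopped:
                self.failed.append(opcode)     # "cmd 0x35 is not queued"
            else:
                self.completed.append(opcode)  # sent and answered

    def logout(self, wait_for_pending):
        if wait_for_pending:
            self.queue_to_driver()  # drain outstanding commands first
        self.stopped = True
        self.queue_to_driver()      # late arrivals hit a stopped session
        self.logged_out = True


def run(wait_for_pending):
    """Delete the device, log out, and report (completed, failed)."""
    session = ToySession()
    session.delete_device()
    session.logout(wait_for_pending)
    return session.completed, session.failed
```

In this model run(False) fails the sync cache, matching the complaint in
the log, while run(True) completes it before the logout goes out.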

		-- Pete


