Alexander Nezhinsky wrote:
> > > and tells us to ignore MaxRecvDataSegmentLength. But it doesn't say
> > > how we should figure out the limit for data-type PDUs, i.e. for RDMA
> > > transfers, or even if there should be one. The phrase "control
> > > PDUs" means non-RDMA transfers.
> >
> > There are no "data-type" PDUs in iSER, that's why no limit for them is
> > mentioned. Control type PDUs can carry unsolicited data, but that is true
> > only for write ops. As to RDMA ops, the initiator communicates the size of
> > transfer and registration key, allowing the target to do the transfer, be it
> > read or write, in one or several RDMA ops, as many as it likes.
>
> iSER spec is silent about the granularity of RDMA transfers
> because it says nothing about the meaning of MaxBurstLength and
> MaxRecvDataSegmentLength, when applied to the solicited data of
> a write op, and to the entire data of a read op.
>
> On the other hand, it maps R2T PDUs to RDMA Reads,
> and Data-IN to RDMA Writes (changing their meaning), but does not
> require that their sizes must be governed by either
> MaxBurstLength (for R2Ts) or MaxRecvDataSegmentLength
> (for Data-INs).
>
> Thus we can interpret them freely, from the formal point of view.
> Moreover, this does not contradict the spirit of the protocol,
> which makes all RDMA transfers a target's responsibility.
>
> > > One approach would be to have the target RDMA the entire data
> > > segment in a single operation. This approach minimizes the
> > > overhead, but doesn't let us pipeline and may not be possible for
> > > large transfer sizes. The OS won't let us pin all the memory
> > > required for the transfer, perhaps.
>
> Another approach is to break both read and write RDMA transfers into
> smaller units, allowing internal queuing, pipelining and efficient use
> of memory.
>
> This means that the target should set for itself some internal values
> of MaxBurstLength and Data-IN's MaxRecvDataSegmentLength.
> These values will govern generation of R2Ts and Data-IN and these,
> in turn, will initiate a series of RDMA transfers with the desired
> granularity.
>
> > > Instead I've added another patch that changes the MaxRDSL in the
> > > target to be whatever was negotiated for IRDSL. Since I see no way
> > > in the spec how the target could send data in a control type PDU,
> > > IRDSL wasn't doing anything for us anyway. And open-iscsi uses its
> > > conn->max_recv_dlength as the starting point for IRDSL, which seems
> > > reasonable.
>
> One example of target sending data in a control type PDU is a Response
> PDU carrying sense data. Other types are Text-Responses outside Login
> phase and some Task mgmt Responses (for higher error levels).
> Anyway, the negotiated IRDSL value doesn't explicitly affect the target.
> It just guarantees that the initiator is able to receive our PDUs.
>
> To summarize, the proposed approach uses the following policy:
>
> 1. If MaxRDSL declared by the other side is different from the negotiated
> value of IRDSL, ignore it.
>

Hold your horses:

    8.2 MaxRecvDataSegmentLength

    For an iSCSI connection belonging to a session in which
    RDMAExtensions=Yes was negotiated on the leading connection of the
    session, MaxRecvDataSegmentLength need not be declared in the
    Login Phase.

We don't need to negotiate MaxRecvDataSegmentLength on an iSER connection.
In practice, we establish the connection over iSER first and only then start
the login phase. Once we have been able to connect over iSER, we don't send
keys that aren't relevant for iSER.
Here's the code from login.c in open-iscsi:

    if (session->type == ISCSI_SESSION_TYPE_DISCOVERY ||
        !session->t->template->rdma) {
            sprintf(value, "%d", conn->max_recv_dlength);
            if (!iscsi_add_text(pdu, data, max_data_length,
                                "MaxRecvDataSegmentLength", value))
                    return 0;
    } else {
            sprintf(value, "%d", conn->max_recv_dlength);
            if (!iscsi_add_text(pdu, data, max_data_length,
                                "InitiatorRecvDataSegmentLength", value))
                    return 0;

            sprintf(value, "%d", conn->max_xmit_dlength);
            if (!iscsi_add_text(pdu, data, max_data_length,
                                "TargetRecvDataSegmentLength", value))
                    return 0;

            if (!iscsi_add_text(pdu, data, max_data_length,
                                "RDMAExtensions", "Yes"))
                    return 0;
    }

Erez
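
P.S. For anyone who wants to poke at that branch outside the open-iscsi tree, here is a rough standalone sketch of the same key selection. The struct, the enum and the helpers (add_key, add_int_key, build_rdsl_keys, has_key) are made up for illustration and are not the open-iscsi API; only the key names and the branch condition come from the login.c snippet above. Keys are written as NUL-terminated "key=value" strings, which is how iSCSI text keys are laid out in a PDU data segment.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the session/connection state (not open-iscsi types). */
enum session_type { SESSION_NORMAL, SESSION_DISCOVERY };

struct login_params {
    enum session_type type;
    int transport_is_rdma;   /* nonzero when the transport is iSER */
    int max_recv_dlength;
    int max_xmit_dlength;
};

/* Append "key=value" plus its NUL terminator; return 0 if it does not fit. */
static int add_key(char *buf, size_t cap, size_t *used,
                   const char *key, const char *value)
{
    size_t room = cap - *used;
    int n = snprintf(buf + *used, room, "%s=%s", key, value);

    if (n < 0 || (size_t)n + 1 > room)
        return 0;
    *used += (size_t)n + 1;
    return 1;
}

/* Same as add_key, for an integer value. */
static int add_int_key(char *buf, size_t cap, size_t *used,
                       const char *key, int v)
{
    char value[16];

    snprintf(value, sizeof(value), "%d", v);
    return add_key(buf, cap, used, key, value);
}

/* Emit the data-segment-length keys; return bytes used, or 0 on overflow. */
static size_t build_rdsl_keys(const struct login_params *p,
                              char *buf, size_t cap)
{
    size_t used = 0;

    if (p->type == SESSION_DISCOVERY || !p->transport_is_rdma) {
        /* Plain iSCSI (or discovery): declare MaxRecvDataSegmentLength. */
        if (!add_int_key(buf, cap, &used, "MaxRecvDataSegmentLength",
                         p->max_recv_dlength))
            return 0;
    } else {
        /* iSER: send the iSER-specific keys and skip MaxRecvDataSegmentLength. */
        if (!add_int_key(buf, cap, &used, "InitiatorRecvDataSegmentLength",
                         p->max_recv_dlength) ||
            !add_int_key(buf, cap, &used, "TargetRecvDataSegmentLength",
                         p->max_xmit_dlength) ||
            !add_key(buf, cap, &used, "RDMAExtensions", "Yes"))
            return 0;
    }
    return used;
}

/* Return 1 if the NUL-separated key list contains an entry named key. */
static int has_key(const char *buf, size_t used, const char *key)
{
    size_t klen = strlen(key);
    size_t off = 0;

    while (off < used) {
        const char *entry = buf + off;

        if (strncmp(entry, key, klen) == 0 && entry[klen] == '=')
            return 1;
        off += strlen(entry) + 1;
    }
    return 0;
}
```

Calling build_rdsl_keys() with transport_is_rdma set and then checking has_key(buf, used, "MaxRecvDataSegmentLength") is a quick way to confirm that the key is never offered on an iSER connection in this sketch.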