Thank you, Or, for the fresh ideas!

Or Gerlitz wrote:
> 1. run with iscsi/tcp - for that end you should change the transport
> name from "iser" to "tcp" in the open iscsi initiator "node" file -
> which is either under /var/lib/iscsi/nodes (e.g
> ./iqn.2008-12.com.voltair.tgt-ib-iser/10.10.0.90,3260,1/default) or
> under /etc/iscsi/nodes - this will sort our the possibility that the
> problem is not related to iser portion of the target and/or the
> initiator

It runs without problems over TCP over IB. We have had this setup in
production for a while. Please have a look at my recent postings on the
open-iscsi mailing list.

> 2. you mentioned HCA firmware of 1.2.0 from which I understand your
> initiator and/or target have the Mellanox Sinai HCA (25204).

Yes, but I also tried ConnectX HCAs, which reproduce the read errors
perfectly.

> Well,
> there was a bug in the mthca driver which came into play only under
> Sinai / iSER (initiator side) - It was fixed quite a while back in
>
> commit 608d8268be392444f825b4fc8fc7c8b509627129
> Author: Michael S. Tsirkin <mst at dev.mellanox.co.il>
> Date: Mon Apr 16 17:04:55 2007 +0300
>
> IB/mthca: Fix data corruption after FMR unmap on Sinai
>
> - make sure you have this fix in your initiator kernel. You mentioned
> that you also tried 2.6.28 but I wasn't clear if it was the initiator
> or target side. Also I wasn't clear if under this kernel you install
> ofed or use the mainline bits.
>

I used 2.6.28 (mainline) on both the target and the initiator side.
2.6.26 behaves identically to 2.6.28. Please have a look at the first
posting in this thread, which describes the whole software setup.

OFED 1.4 does not compile completely under Debian (Lenny): the iser
kernel module refuses to compile. I'm in contact with Guy Coates, who is
currently packaging the OFED-1.4 stack for Debian sid
(http://www.beowulf.org/archive/2009-January/024313.html), to overcome
the compile issue. I will report this compile issue to ofa-general.

I am pretty sure that the mentioned IB/mthca fix has found its way into
the mainline 2.6.26 kernel. The mthca_mr.c files in OFED-1.4 and in
2.6.26 differ only slightly, and the place that the data-corruption fix
addresses no longer has anything to do with the current code. Please
correct me if I am wrong.

ares:/usr/src/linux-source-2.6.26/drivers/infiniband/hw/mthca# diff ~/mthca_mr.c.1.4 mthca_mr.c
31a32,33
>  *
>  * $Id: mthca_mr.c 1349 2004-12-16 21:09:43Z roland $
92,98c94,99
<     for (o = order; o <= buddy->max_order; ++o)
<         if (buddy->num_free[o]) {
<             m = 1 << (buddy->max_order - o);
<             seg = find_first_bit(buddy->bits[o], m);
<             if (seg < m)
<                 goto found;
<         }
---
>     for (o = order; o <= buddy->max_order; ++o) {
>         m = 1 << (buddy->max_order - o);
>         seg = find_first_bit(buddy->bits[o], m);
>         if (seg < m)
>             goto found;
>     }
105d105
<     --buddy->num_free[o];
111d110
<     ++buddy->num_free[o];
129d127
<     --buddy->num_free[order];
135d132
<     ++buddy->num_free[order];
149,151c146
<     buddy->num_free = kzalloc((buddy->max_order + 1) * sizeof (int *),
<                   GFP_KERNEL);
<     if (!buddy->bits || !buddy->num_free)
---
>     if (!buddy->bits)
164d158
<     buddy->num_free[buddy->max_order] = 1;
172d165
< err_out:
174d166
<     kfree(buddy->num_free);
175a168
> err_out:
187d179
<     kfree(buddy->num_free);

Best regards,

Volker

-- 
====================================================
   inqbus it-consulting      +49 ( 341 )  5643800
   Dr. Volker Jaenisch       http://www.inqbus.de
   Herloßsohnstr. 12         04155 Leipzig
   Emergencies               +49 ( 170 )  3113748
====================================================