[stgt] [PATCH] bs_sheepdog.c: fix io errors during writing to a snapshot

Ryusuke Konishi konishi.ryusuke at lab.ntt.co.jp
Tue Jan 7 13:59:31 CET 2014


The sheepdog driver fails with io errors when write access is
requested for a snapshot vdi.  The failure happens in create_branch
function:

 tgtd: read_write_object(684) No object found (oid: 8000000
 000000000, old_oid: 0)
 tgtd: create_branch(1160) reloading new inode object failed
 tgtd: bs_sheepdog_request(1197) creating writable VDI from
 snapshot failed

 sd 12:0:0:1: [sdb] Unhandled sense code
 sd 12:0:0:1: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
 sd 12:0:0:1: [sdb] Sense Key : Medium Error [current]
 sd 12:0:0:1: [sdb] Add. Sense: Unrecovered read error
 sd 12:0:0:1: [sdb] CDB: Write(10): 2a 00 00 00 20 a8 00 00 08 00
 Buffer I/O error on device sdb1, logical block 1041
 lost page write due to I/O error on sdb1

This turned out to be caused by a race condition among multiple write
requests.  When bs_sheepdog_request() receives a write request for the
snapshot vdi, it tries to change the snapshot to a writable vdi with
the create_branch function.  However, the current implementation of
create_branch() cannot handle concurrent requests exclusively nor
protected from regular io routine (sd_io).

This fixes the above io-error issue by serializing create_branch()
with a pthread reader/writer lock, and also fixes the race condition
between create_branch() and sd_io() with the lock.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke at lab.ntt.co.jp>
Cc: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
---
 usr/bs_sheepdog.c |   22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/usr/bs_sheepdog.c b/usr/bs_sheepdog.c
index 59b2d25..6d60bd6 100644
--- a/usr/bs_sheepdog.c
+++ b/usr/bs_sheepdog.c
@@ -1116,6 +1116,16 @@ static int create_branch(struct sheepdog_access_info *ai)
 	unsigned int wlen = 0, rlen;
 	int ret;
 
+	ret = pthread_rwlock_wrlock(&ai->inode_lock);
+	if (ret) {
+		eprintf("failed to get inode lock %s\n", strerror(ret));
+		return -1;
+	}
+
+	if (!ai->is_snapshot)
+		/* check again the snapshot flag to avoid race condition */
+		goto out;
+
 	memset(&hdr, 0, sizeof(hdr));
 	hdr.opcode = SD_OP_DEL_VDI;
 	hdr.vdi_id = ai->inode.vdi_id;
@@ -1128,7 +1138,7 @@ static int create_branch(struct sheepdog_access_info *ai)
 		     &wlen, &rlen);
 	if (ret) {
 		eprintf("deleting snapshot VDI for creating branch failed\n");
-		return -1;
+		goto out;
 	}
 
 	memset(&hdr, 0, sizeof(hdr));
@@ -1144,19 +1154,22 @@ static int create_branch(struct sheepdog_access_info *ai)
 		     &wlen, &rlen);
 	if (ret) {
 		eprintf("creating new VDI for creating branch failed\n");
-		return -1;
+		goto out;
 	}
 
 	ret = read_object(ai, (char *)&ai->inode, vid_to_vdi_oid(rsp->vdi_id),
 			  ai->inode.nr_copies, SD_INODE_SIZE, 0);
 	if (ret) {
 		eprintf("reloading new inode object failed");
-		return -1;
+		goto out;
 	}
 
+	ai->is_snapshot = 0;
 	dprintf("creating branch from snapshot, new VDI ID: %x\n", rsp->vdi_id);
+out:
+	pthread_rwlock_unlock(&ai->inode_lock);
 
-	return 0;
+	return ret;
 }
 
 static void bs_sheepdog_request(struct scsi_cmd *cmd)
@@ -1190,7 +1203,6 @@ static void bs_sheepdog_request(struct scsi_cmd *cmd)
 
 				break;
 			}
-			ai->is_snapshot = 0;
 		}
 
 		length = scsi_get_out_length(cmd);
-- 
1.7.9.3

--
To unsubscribe from this list: send the line "unsubscribe stgt" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



More information about the stgt mailing list