[sheepdog] [PATCH v2] object cache: fix a race problem
Liu Yuan
namei.unix at gmail.com
Sun Jun 3 14:13:16 CEST 2012
From: Liu Yuan <tailai.ly at taobao.com>
Fix the following problem:
...
Jun 03 18:39:53 do_local_io(52) 2, ac1a3e000019b7 , 1
Jun 03 18:39:53 object_cache_pull(529) oid ac1a3e000019b7pulled successfully
Jun 03 18:39:53 object_cache_pull(529) oid ac1a3e000019b7pulled successfully
Jun 03 18:39:53 create_cache_object(451) 000019b7 already created
Jun 03 18:39:53 object_cache_rw(415) 000019b7, len 4096, off 1048576
Jun 03 18:39:53 read_cache_object(396) size 0, count:4096, offset 1048576 File exists
Jun 03 18:39:53 do_gateway_request(308) failed: 2, ac1a3e000019b7 , 1, 3
Jun 03 18:39:53 gateway_op_done(151) leaving sheepdog cluster
...
The problem: suppose two cloned VMs read the same COW oid at the same time:
        A                               B
object_cache_pull() {           object_cache_pull() {
  create_cache_object() {         create_cache_object() {
    open(oid);
                                    open(oid) {
                                      oid_already_opened() {
                                        goto out;
                                      }
                                    }
                                  }
                                }
                                read_cache_object() {
                                  read_size != requested_length;
                                  return EIO;
                                }
    write(oid);
  }
}
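
For illustration, the interleaving above and the polling approach taken
by this patch can be sketched in isolation. This is only a minimal
sketch under stated assumptions, not the sheepdog code: the names
create_or_wait() and wait_until_written() are hypothetical, the file
mode 0644 is arbitrary, and sched_yield() stands in for the
pthread_yield() call used in the patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of sheepdog): poll until the file at
 * `path` has a non-zero size. Returns 0 on success, -1 on error.
 */
static int wait_until_written(const char *path)
{
	struct stat st;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	for (;;) {
		if (fstat(fd, &st) < 0) {
			close(fd);
			return -1;
		}
		if (st.st_size != 0)
			break;
		/* The creator has not written yet; let it run. */
		sched_yield();
	}
	close(fd);
	return 0;
}

/*
 * Hypothetical stand-in for the create path: the first caller creates
 * the file exclusively and writes `len` bytes; a later caller gets
 * EEXIST and waits for the data instead of returning immediately.
 * Returns 1 if this caller created the file, 0 if it already existed
 * and now has data, -1 on error.
 */
static int create_or_wait(const char *path, const void *data, size_t len)
{
	ssize_t n;
	int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);

	if (fd < 0) {
		if (errno == EEXIST)
			return wait_until_written(path) < 0 ? -1 : 0;
		return -1;
	}
	n = write(fd, data, len);
	close(fd);
	return n == (ssize_t)len ? 1 : -1;
}
```

Note that a non-zero size only shows that the creator has started
writing, not that the whole object is in place, which is one reason
this kind of polling is a workaround and not a complete fix.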
The fix looks more like a workaround; I would be happy to see a better fix.
Signed-off-by: Liu Yuan <tailai.ly at taobao.com>
---
sheep/object_cache.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/sheep/object_cache.c b/sheep/object_cache.c
index e091930..0a055b3 100644
--- a/sheep/object_cache.c
+++ b/sheep/object_cache.c
@@ -448,9 +448,25 @@ static int create_cache_object(struct object_cache *oc, uint32_t idx, void *buff
fd = open(buf.buf, flags, def_fmode);
if (fd < 0) {
if (errno == EEXIST) {
+ struct stat st;
+ fd = open(buf.buf, def_open_flags);
+ if (fd < 0) {
+ eprintf("%m\n");
+ goto out;
+ }
+ if (fstat(fd, &st) < 0)
+ goto out_close;
+ /* Wait for file to be written by pull worker */
+ while (!st.st_size) {
+ pthread_yield();
+ if (fstat(fd, &st) < 0)
+ goto out_close;
+ }
+ close(fd);
dprintf("%08"PRIx32" already created\n", idx);
goto out;
}
+ dprintf("%m\n");
ret = SD_RES_EIO;
goto out;
}
@@ -526,7 +542,7 @@ static int object_cache_pull(struct vnode_info *vnodes, struct object_cache *oc,
ret = forward_read_obj_req(&read_req);
if (ret == SD_RES_SUCCESS) {
- dprintf("oid %"PRIx64"pulled successfully\n", oid);
+ dprintf("oid %"PRIx64" pulled successfully\n", oid);
ret = create_cache_object(oc, idx, buf, data_length);
}
free(buf);
--
1.7.10.2