From: Liu Yuan <tailai.ly at taobao.com>

Fix the following problem:
...
Jun 03 18:39:53 do_local_io(52) 2, ac1a3e000019b7 , 1
Jun 03 18:39:53 object_cache_pull(529) oid ac1a3e000019b7pulled successfully
Jun 03 18:39:53 object_cache_pull(529) oid ac1a3e000019b7pulled successfully
Jun 03 18:39:53 create_cache_object(451) 000019b7 already created
Jun 03 18:39:53 object_cache_rw(415) 000019b7, len 4096, off 1048576
Jun 03 18:39:53 read_cache_object(396) size 0, count:4096, offset 1048576 File exists
Jun 03 18:39:53 do_gateway_request(308) failed: 2, ac1a3e000019b7 , 1, 3
Jun 03 18:39:53 gateway_op_done(151) leaving sheepdog cluster
...

The problem is this: suppose two cloned VMs read the same COW oid:

     A                                  B
object_cache_pull() {              object_cache_pull() {
    create_cache_object() {            create_cache_object() {
        open(oid);                         open(oid) {
                                               oid_already_opened() {
                                                   goto out;
                                               }
                                           }
                                       }
                                   }
                                   read_cache_object() {
                                       read_size != requested_length;
                                       return EIO;
                                   }
        write(oid);
    }
}

B finds the object already created and returns before A has written any
data into it, so B's read comes back short (size 0 in the log above) and
fails with EIO.

The fix looks more like a workaround; I would be happy to see a better fix.

Signed-off-by: Liu Yuan <tailai.ly at taobao.com>
---
 sheep/object_cache.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/sheep/object_cache.c b/sheep/object_cache.c
index e091930..05e50d5 100644
--- a/sheep/object_cache.c
+++ b/sheep/object_cache.c
@@ -448,9 +448,17 @@ static int create_cache_object(struct object_cache *oc, uint32_t idx, void *buff
 	fd = open(buf.buf, flags, def_fmode);
 	if (fd < 0) {
 		if (errno == EEXIST) {
+			struct stat st;
+			fstat(fd, &st);
+			/* Wait for file to be written by pull worker */
+			while (!st.st_size) {
+				pthread_yield();
+				fstat(fd, &st);
+			}
 			dprintf("%08"PRIx32" already created\n", idx);
 			goto out;
 		}
+		dprintf("%m\n");
 		ret = SD_RES_EIO;
 		goto out;
 	}
@@ -526,7 +534,7 @@ static int object_cache_pull(struct vnode_info *vnodes, struct object_cache *oc,
 	ret = forward_read_obj_req(&read_req);
 	if (ret == SD_RES_SUCCESS) {
-		dprintf("oid %"PRIx64"pulled successfully\n", oid);
+		dprintf("oid %"PRIx64" pulled successfully\n", oid);
 		ret = create_cache_object(oc, idx, buf, data_length);
 	}
 	free(buf);
-- 
1.7.10.2
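
One detail of the wait loop may be worth a second look: in the EEXIST branch open() has already failed, so fd is negative and fstat(fd, &st) would return EBADF without filling in st. The following is a minimal standalone sketch of the same "create exclusively, otherwise wait for the writer" idea that polls the path with stat() instead. It is not sheepdog code: create_or_wait() and its parameters are hypothetical names, sched_yield() is swapped in for pthread_yield(), and waiting for the full length rather than for any non-zero size is an assumption about what the reader needs.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of sheepdog): create `path` exclusively
 * and fill it with `len` bytes from `data`. If someone else created the
 * file first, wait until it holds at least `len` bytes.
 * Returns 0 on success, -1 on error.
 */
static int create_or_wait(const char *path, const void *data, size_t len)
{
	struct stat st;
	ssize_t n;
	int fd;

	assert(path != NULL);

	fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
	if (fd < 0) {
		if (errno != EEXIST)
			return -1;
		/*
		 * We lost the race. fd is invalid here, so poll the path
		 * rather than the descriptor. Note: this spins forever if
		 * the creator dies before writing, which is why it is
		 * still only a workaround.
		 */
		for (;;) {
			if (stat(path, &st) < 0)
				return -1;
			if ((size_t)st.st_size >= len)
				return 0;
			sched_yield();
		}
	}

	/* We won the race: write the payload, then close. */
	n = write(fd, data, len);
	close(fd);
	return n == (ssize_t)len ? 0 : -1;
}
```

The first caller for a given path takes the creation branch and writes the data; any later caller takes the EEXIST branch and returns only once stat() reports the expected size.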