[sheepdog] [PATCH v2] object cache: fix a race problem
Liu Yuan
namei.unix at gmail.com
Mon Jun 4 04:04:18 CEST 2012
On 06/03/2012 08:13 PM, Liu Yuan wrote:
> Fix the following problem:
> ...
> Jun 03 18:39:53 do_local_io(52) 2, ac1a3e000019b7 , 1
> Jun 03 18:39:53 object_cache_pull(529) oid ac1a3e000019b7pulled successfully
> Jun 03 18:39:53 object_cache_pull(529) oid ac1a3e000019b7pulled successfully
> Jun 03 18:39:53 create_cache_object(451) 000019b7 already created
> Jun 03 18:39:53 object_cache_rw(415) 000019b7, len 4096, off 1048576
> Jun 03 18:39:53 read_cache_object(396) size 0, count:4096, offset 1048576 File exists
> Jun 03 18:39:53 do_gateway_request(308) failed: 2, ac1a3e000019b7 , 1, 3
> Jun 03 18:39:53 gateway_op_done(151) leaving sheepdog cluster
> ...
>
> The problem is this: suppose two cloned VMs read the same COW oid:
>
>         A                              B
>
> object_cache_pull() {            object_cache_pull() {
>   create_cache_object() {          create_cache_object() {
>     open(oid);
>                                      open(oid) {
>                                        oid_already_opened() {
>                                          goto out;
>                                        }
>                                      }
>                                    }
>                                  }
>                                  read_cache_object() {
>                                    read_size != requested_length;
>                                    return EIO;
>                                  }
>     write(oid);
>   }
> }
>
> The fix looks more like a workaround; I will be happy to see a better fix.
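For reference, the interleaving quoted above can be replayed by hand in a single thread. This is only a sketch: the function name, the path and the use of O_CREAT | O_EXCL are assumptions for illustration, not the actual sheepdog code. It shows that a reader who finds the cache file already created, but not yet written, gets a zero-byte read.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Replay the A/B interleaving in one thread (hypothetical helper).
 * Returns the number of bytes "B" reads from a cache file that "A"
 * has created but not yet filled, or -1 on an unexpected error.
 */
ssize_t replay_short_read(const char *path, size_t len, off_t off)
{
	char buf[4096];
	int fd_a, fd_b;
	ssize_t n;

	assert(path != NULL && len <= sizeof(buf));

	unlink(path);	/* start from "object not cached yet" */

	/* A: creates the cache file, but has not written any data yet */
	fd_a = open(path, O_CREAT | O_EXCL | O_RDWR, 0644);
	if (fd_a < 0)
		return -1;

	/* B: exclusive create must fail with EEXIST ("already created") */
	fd_b = open(path, O_CREAT | O_EXCL | O_RDWR, 0644);
	if (fd_b >= 0 || errno != EEXIST) {
		if (fd_b >= 0)
			close(fd_b);
		close(fd_a);
		unlink(path);
		return -1;
	}

	/* B: assumes the object is cached and reads it right away */
	fd_b = open(path, O_RDWR);
	if (fd_b < 0) {
		close(fd_a);
		unlink(path);
		return -1;
	}
	n = pread(fd_b, buf, len, off);	/* file is still empty here */

	close(fd_b);
	close(fd_a);
	unlink(path);
	return n;
}
```

Called with len 4096 and offset 1048576, as in the log, the read returns fewer bytes than requested, which is the condition that read_cache_object() reports as an error.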
Dropped. The real problem is fcntl locking, which does not work across
different FDs within the same process.
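A minimal sketch of that behaviour, assuming a POSIX system: fcntl() record locks are owned by the process rather than by the file descriptor, so a second write lock taken through another FD in the same process is granted instead of refused. The helper names and the path are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to take a non-blocking exclusive lock on the whole file via fd.
 * Returns 0 on success, -1 if the lock is refused or fcntl() fails.
 */
static int try_write_lock(int fd)
{
	struct flock fl = {
		.l_type = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 0,	/* 0 means "up to end of file" */
	};

	return fcntl(fd, F_SETLK, &fl);
}

/*
 * Open the same file twice in this process and write-lock it through
 * both descriptors. Returns 1 if the second lock was also granted
 * (no mutual exclusion), 0 if it was refused, -1 on a setup error.
 */
int second_lock_granted(const char *path)
{
	int fd1, fd2, ret;

	assert(path != NULL);

	fd1 = open(path, O_RDWR | O_CREAT, 0644);
	if (fd1 < 0)
		return -1;
	fd2 = open(path, O_RDWR);
	if (fd2 < 0) {
		close(fd1);
		return -1;
	}

	if (try_write_lock(fd1) < 0) {
		ret = -1;
		goto out;
	}
	/* Same process, different FD: the lock does not conflict. */
	ret = (try_write_lock(fd2) == 0) ? 1 : 0;
out:
	close(fd2);
	close(fd1);
	return ret;
}
```

So two threads that each open the same cache object cannot exclude each other this way; a lock shared inside the process would be needed for that.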
Thanks,
Yuan