[Sheepdog] [PATCH 2/2] make vdi setattr atomic

Chris Webb chris at arachsys.com
Mon Oct 24 12:36:41 CEST 2011


MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> writes:

> Yes, I pushed many patches which simplify cluster communications, so
> the problem might be solved with the current master branch.  Anyway,
> I'll try to find what caused the problem. :)

Hi Kazutaka. I pulled the current head of master, 5d8ab0de8e. I'm afraid
this now breaks quite spectacularly when I first try to create a drive
(which does a vdi create followed by a handful of vdi setattrs and getattrs)
on my one-host, three-sheep minicluster: a setattr fails, and a subsequent
vdi list then fails too:

0026# collie vdi list
  name        id    size    used  shared    creation time   vdi id
------------------------------------------------------------------
failed to read object, 800a8ac400000000 Remote node has a new epoch
failed to read a inode header

The three sheep.logs (one per sheep, on ports 7000, 7001 and 7002) read:

Oct 24 10:34:16 read_epoch(2036) failed to read epoch 0
Oct 24 10:34:16 send_join_request(976) ip: 172.16.101.7, port: 7000
Oct 24 10:34:16 main(216) Sheepdog daemon (version 0.2.4-12-g737596b-dirty) started
Oct 24 10:34:16 cluster_queue_request(192) 0x7f9336afa010 84
Oct 24 10:34:16 read_epoch(2036) failed to read epoch 0
Oct 24 10:34:16 update_cluster_info(559) status = 2, epoch = 0, 0, 0
Oct 24 10:34:16 update_cluster_info(559) status = 1, epoch = 1, 0, 1
Oct 24 10:34:16 update_cluster_info(559) status = 1, epoch = 1, 0, 1
Oct 24 10:34:17 cluster_queue_request(192) 0x1c44cb0 82
Oct 24 10:34:17 cluster_queue_request(192) 0x1c44cb0 11
Oct 24 10:34:17 do_lookup_vdi(238) looking for 75bd8c98-3c55-45d3-bc16-3af62601a3a5 36, a8ac4
Oct 24 10:34:17 add_vdi(327) we create a new vdi, 0 75bd8c98-3c55-45d3-bc16-3af62601a3a5 (36) 539545600, vid: a8ac4, base 0, cur 0 
Oct 24 10:34:17 add_vdi(331) qemu doesn't specify the copies... 1
Oct 24 10:34:17 __sd_notify_done(733) done 0 690884
Oct 24 10:34:17 cluster_queue_request(192) 0x1c44cb0 82
Oct 24 10:34:17 cluster_queue_request(192) 0x1c44cb0 89
Oct 24 10:34:17 do_lookup_vdi(238) looking for 75bd8c98-3c55-45d3-bc16-3af62601a3a5 36, a8ac4
Oct 24 10:34:20 cluster_queue_request(192) 0x1c44cb0 82

Oct 24 10:34:16 read_epoch(2036) failed to read epoch 0
Oct 24 10:34:16 send_join_request(976) ip: 172.16.101.7, port: 7001
Oct 24 10:34:16 main(216) Sheepdog daemon (version 0.2.4-12-g737596b-dirty) started
Oct 24 10:34:16 get_vdi_bitmap_from(502) get the vdi bitmap from 172.16.101.7
Oct 24 10:34:16 get_vdi_bitmap_from(502) get the vdi bitmap from 172.16.101.7
Oct 24 10:34:17 update_cluster_info(559) status = 1, epoch = 1, 0, 0
Oct 24 10:34:17 check_epoch(1145) new node version 2 3 1
Oct 24 10:34:17 __sd_notify_done(733) done 0 690884
Oct 24 10:34:17 check_epoch(1145) new node version 2 3 2
Oct 24 10:34:21 check_epoch(1145) new node version 2 3 2
Oct 24 10:35:02 check_epoch(1145) new node version 2 3 2

Oct 24 10:34:16 read_epoch(2036) failed to read epoch 0
Oct 24 10:34:16 send_join_request(976) ip: 172.16.101.7, port: 7002
Oct 24 10:34:16 main(216) Sheepdog daemon (version 0.2.4-12-g737596b-dirty) started
Oct 24 10:34:16 get_vdi_bitmap_from(502) get the vdi bitmap from 172.16.101.7
Oct 24 10:34:16 update_cluster_info(559) status = 1, epoch = 1, 0, 0
Oct 24 10:34:17 update_cluster_info(559) status = 1, epoch = 1, 0, 1
Oct 24 10:34:17 get_obj_list(133) /mnt/sheep-0026-02/obj/00000001/
Oct 24 10:34:17 get_obj_list(133) /mnt/sheep-0026-02/obj/00000001/
Oct 24 10:34:17 get_obj_list(133) /mnt/sheep-0026-02/obj/00000001/
Oct 24 10:34:17 __sd_notify_done(733) done 0 690884
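
Since the three logs describe the same few seconds from three daemons, it can help to interleave them on one timeline. The following is only a rough sketch, assuming every line has the shape "Mon DD HH:MM:SS function(line) message" seen above. The names merge_logs, parse_log and Entry are made up for illustration, the node labels are simply the ports from send_join_request, and the year is an assumption because the logs do not record one.

```python
import re
from datetime import datetime
from typing import NamedTuple

# Matches lines such as:
#   Oct 24 10:34:17 check_epoch(1145) new node version 2 3 1
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} +\d+ \d\d:\d\d:\d\d) "
    r"(?P<func>\w+)\((?P<line>\d+)\) (?P<msg>.*)$"
)


class Entry(NamedTuple):
    when: datetime
    node: str
    func: str
    line: int
    msg: str


def parse_log(node, text, year=2011):
    """Parse one sheep.log; lines that do not match are skipped.

    The year is an assumption (the log format omits it).
    """
    entries = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue
        when = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
        entries.append(Entry(when, node, m["func"], int(m["line"]), m["msg"]))
    return entries


def merge_logs(logs, year=2011):
    """Interleave several logs by timestamp.

    The sort is stable, so lines sharing a one-second timestamp keep
    their original per-node order; ordering across nodes within the
    same second cannot be recovered from these logs.
    """
    merged = []
    for node, text in logs.items():
        merged.extend(parse_log(node, text, year))
    return sorted(merged, key=lambda e: e.when)


if __name__ == "__main__":
    sample = {
        "7000": "Oct 24 10:34:16 read_epoch(2036) failed to read epoch 0\n"
                "Oct 24 10:34:17 __sd_notify_done(733) done 0 690884\n",
        "7001": "Oct 24 10:34:17 check_epoch(1145) new node version 2 3 1\n",
    }
    for e in merge_logs(sample):
        print(f"{e.when:%H:%M:%S} [{e.node}] {e.func}({e.line}) {e.msg}")
```

With real files, the dictionary values would be the contents of each sheep.log; the timestamps only have one-second resolution, so the merged view shows rough ordering at best.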

On another run with a clean sheepdog, I get a successful vdi create, followed
by

  vdi setattr -x 3eb42043-a142-4016-9a83-30e56101af23 lock <<< '002689c3-aeab-433d-bafc-acfb95dafe7c:16692:1319452195'

returning exit status 1. The three sheep.logs show

Oct 24 10:29:52 read_epoch(2036) failed to read epoch 0
Oct 24 10:29:52 send_join_request(976) ip: 172.16.101.7, port: 7000
Oct 24 10:29:52 main(216) Sheepdog daemon (version 0.2.4-12-g737596b-dirty) started
Oct 24 10:29:52 cluster_queue_request(192) 0x7f2b33bd9010 84
Oct 24 10:29:52 read_epoch(2036) failed to read epoch 0
Oct 24 10:29:52 update_cluster_info(559) status = 2, epoch = 0, 0, 0
Oct 24 10:29:53 update_cluster_info(559) status = 1, epoch = 1, 0, 1
Oct 24 10:29:53 update_cluster_info(559) status = 1, epoch = 1, 0, 1
Oct 24 10:29:53 __fill_obj_list(1657) try again, 0, 22
Oct 24 10:29:54 cluster_queue_request(192) 0x269ad20 82
Oct 24 10:29:54 cluster_queue_request(192) 0x269ad20 11
Oct 24 10:29:54 do_lookup_vdi(238) looking for 3eb42043-a142-4016-9a83-30e56101af23 36, a6dd79
Oct 24 10:29:54 add_vdi(327) we create a new vdi, 0 3eb42043-a142-4016-9a83-30e56101af23 (36) 539545600, vid: a6dd79, base 0, cur 0 
Oct 24 10:29:54 add_vdi(331) qemu doesn't specify the copies... 1
Oct 24 10:29:55 __sd_notify_done(733) done 0 10935673
Oct 24 10:29:55 cluster_queue_request(192) 0x269ad20 82
Oct 24 10:29:55 cluster_queue_request(192) 0x269ad20 89
Oct 24 10:29:55 do_lookup_vdi(238) looking for 3eb42043-a142-4016-9a83-30e56101af23 36, a6dd79
Oct 24 10:30:02 cluster_queue_request(192) 0x269ad20 82
Oct 24 10:30:02 cluster_queue_request(192) 0x269ad20 82
Oct 24 10:30:02 cluster_queue_request(192) 0x269ad20 89
Oct 24 10:30:02 do_lookup_vdi(238) looking for 3eb42043-a142-4016-9a83-30e56101af23 36, a6dd79
Oct 24 10:30:02 ob_open(449) failed to open /mnt/sheep-0026-00/obj/00000003/20a6dd797853c6e2, No such file or directory
Oct 24 10:30:02 read_object(727) fail 20a6dd797853c6e2 -2
Oct 24 10:30:02 cluster_queue_request(192) 0x269ad20 82
Oct 24 10:30:02 cluster_queue_request(192) 0x269ad20 89
Oct 24 10:30:02 do_lookup_vdi(238) looking for 3eb42043-a142-4016-9a83-30e56101af23 36, a6dd79
Oct 24 10:30:02 cluster_queue_request(192) 0x269ad20 82
Oct 24 10:30:33 cluster_queue_request(192) 0x269ad20 82

Oct 24 10:29:52 read_epoch(2036) failed to read epoch 0
Oct 24 10:29:52 send_join_request(976) ip: 172.16.101.7, port: 7001
Oct 24 10:29:52 main(216) Sheepdog daemon (version 0.2.4-12-g737596b-dirty) started
Oct 24 10:29:53 get_vdi_bitmap_from(502) get the vdi bitmap from 172.16.101.7
Oct 24 10:29:53 get_vdi_bitmap_from(502) get the vdi bitmap from 172.16.101.7
Oct 24 10:29:53 update_cluster_info(559) status = 1, epoch = 1, 0, 0
Oct 24 10:29:55 __sd_notify_done(733) done 0 10935673
Oct 24 10:29:55 check_epoch(1145) new node version 2 3 2

Oct 24 10:29:52 read_epoch(2036) failed to read epoch 0
Oct 24 10:29:52 send_join_request(976) ip: 172.16.101.7, port: 7002
Oct 24 10:29:52 main(216) Sheepdog daemon (version 0.2.4-12-g737596b-dirty) started
Oct 24 10:29:53 get_vdi_bitmap_from(502) get the vdi bitmap from 172.16.101.7
Oct 24 10:29:53 update_cluster_info(559) status = 1, epoch = 1, 0, 0
Oct 24 10:29:53 update_cluster_info(559) status = 1, epoch = 1, 0, 1
Oct 24 10:29:53 get_obj_list(133) /mnt/sheep-0026-02/obj/00000001/
Oct 24 10:29:54 get_obj_list(133) /mnt/sheep-0026-02/obj/00000001/
Oct 24 10:29:54 get_obj_list(133) /mnt/sheep-0026-02/obj/00000001/
Oct 24 10:29:55 __sd_notify_done(733) done 0 10935673
Oct 24 10:30:02 ob_open(449) failed to open /mnt/sheep-0026-02/obj/00000003/20a6dd79886e13c4, No such file or directory
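
The object IDs in these messages can be unpacked to see which kind of object is missing. The sketch below assumes the bit layout from sheepdog_proto.h as recalled rather than as quoted in this thread: VDI_BIT at bit 63, VMSTATE_BIT at bit 62, VDI_ATTR_BIT at bit 61, and a 24-bit vid shifted left by 32. The decode_oid helper is hypothetical.

```python
# Assumed constants (recalled from sheepdog_proto.h, not confirmed here).
VDI_SPACE_SHIFT = 32
VDI_BIT = 1 << 63
VMSTATE_BIT = 1 << 62
VDI_ATTR_BIT = 1 << 61
VID_MASK = (1 << 24) - 1  # assumes vids are 24 bits wide


def decode_oid(oid_hex):
    """Split a hex object ID into (kind, vid, low 32 bits)."""
    oid = int(oid_hex, 16)
    if oid & VDI_BIT:
        kind = "vdi"
    elif oid & VMSTATE_BIT:
        kind = "vmstate"
    elif oid & VDI_ATTR_BIT:
        kind = "attr"
    else:
        kind = "data"
    vid = (oid >> VDI_SPACE_SHIFT) & VID_MASK
    low = oid & 0xFFFFFFFF  # attribute id or data object index
    return kind, format(vid, "x"), low


if __name__ == "__main__":
    for oid in ("800a8ac400000000", "20a6dd797853c6e2", "20a6dd79886e13c4"):
        print(oid, decode_oid(oid))
```

Under that assumption, 800a8ac400000000 from the vdi list failure is the inode object of vid a8ac4, and the two objects that ob_open cannot find are attribute objects of vid a6dd79, which is consistent with the failing command being a setattr.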

Best wishes,

Chris.


