On 05/31/2013 03:25 PM, Liu Yuan wrote:
> On 05/30/2013 10:27 PM, MORITA Kazutaka wrote:
>> From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
>>
>> v2:
>>  - update comment of zk_create_seq_node()
>>
>> The first patch fixes a problem under heavy network traffic, and the
>> second patch is a clean-up one.
>>
>> MORITA Kazutaka (2):
>>   zookeeper: retry zk_create_seq_node on retryable error
>>   zookeeper: use offsetof to calculate offset
>>
>>  sheep/cluster/zookeeper.c | 82 ++++++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 73 insertions(+), 9 deletions(-)
>>
>
> Just FYI, I hit the following scenario:
>
> May 31 15:07:51 [main] zk_queue_push(328) create path:/sheepdog/queue/0000000181, queue_pos:0000000179, len:152 <--- zk seems to have retried internally and created 3 seq nodes
> May 31 15:07:51 [main] recalculate_vnodes(865) node 7000 has 96 vnodes, free space 355984654336
> May 31 15:07:51 [main] recalculate_vnodes(865) node 7001 has 48 vnodes, free space 178013593600
> May 31 15:07:51 [main] recalculate_vnodes(865) node 7002 has 48 vnodes, free space 178013462528
> May 31 15:07:51 [main] update_epoch_log(42) update epoch: 2, 3
> May 31 15:07:52 [rw] prepare_object_list(761) 2
> May 31 15:07:52 [rw] wait_get_vdis_done(832) waiting for vdi list
> May 31 15:07:52 [rw] wait_get_vdis_done(839) vdi list ready
> May 31 15:07:52 [rw] fetch_object_list(670) 10.32.228.126 7001
> May 31 15:07:52 [rw] sockfd_cache_get(387) 10.32.228.126:7001, idx 0
> May 31 15:07:52 [main] zk_event_handler(803) 1, 179
> May 31 15:07:52 [main] zk_queue_pop_advance(366) /sheepdog/queue/0000000179, type:7, len:152, pos:179
> May 31 15:07:52 [main] zk_handle_update_node(776) IPv4 ip:10.32.228.126 port:7000 <-- 1
> May 31 15:07:52 [main] build_node_list(433) nr_sd_nodes:3
> May 31 15:07:52 [main] listen_handler(867) accepted a new connection: 21
> May 31 15:07:52 [main] zk_event_handler(803) 1, 180
> May 31 15:07:52 [main] zk_queue_pop_advance(366) /sheepdog/queue/0000000180, type:7, len:152, pos:180
> May 31 15:07:52 [main] zk_handle_update_node(776) IPv4 ip:10.32.228.126 port:7000 <-- 2
> May 31 15:07:52 [main] build_node_list(433) nr_sd_nodes:3
> May 31 15:07:52 [main] listen_handler(867) accepted a new connection: 22
> May 31 15:07:52 [main] client_handler(808) 1, rx 0, tx 0
> May 31 15:07:52 [main] finish_rx(612) 21, 10.32.228.126:37320
> May 31 15:07:52 [main] queue_request(353) GET_OBJ_LIST, 1
> May 31 15:07:52 [main] zk_event_handler(803) 1, 181
> May 31 15:07:52 [io 3112] do_process_work(1376) a1, 0, 2
> May 31 15:07:52 [main] zk_queue_pop_advance(366) /sheepdog/queue/0000000181, type:7, len:152, pos:181
> May 31 15:07:52 [main] zk_handle_update_node(776) IPv4 ip:10.32.228.126 port:7000 <-- 3
>
> zk_handle_update_node was called three times. It does no harm for this event,
> but if it were another kind of event, such as a node event, I guess it would
> screw up the sheep.
>

Oops, this was caused by my patch.

Thanks,
Yuan
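
For readers of the archive, here is a minimal sketch of one way a retried sequential create can leave extra queue entries behind, which would match the three events (179, 180, 181) in the log above. It is only an illustration under assumptions: it does not use the real ZooKeeper client API, struct fake_zk and both functions are hypothetical names, and it assumes that each create reaches the server while the reply is lost, so the caller sees a retryable error and tries again.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a ZooKeeper ensemble; not the real client API. */
struct fake_zk {
	int next_seq;     /* next sequence number the server will assign */
	int nr_nodes;     /* sequential nodes actually created so far */
	int lost_replies; /* how many upcoming replies are lost in transit */
};

/*
 * Create a sequential node. In this model the node is always created on
 * the server side, but the reply may be lost, in which case the caller
 * only sees a retryable error and cannot tell that the create happened.
 */
static bool fake_create_seq(struct fake_zk *zk, int *seq)
{
	int assigned = zk->next_seq++;

	zk->nr_nodes++;
	if (zk->lost_replies > 0) {
		zk->lost_replies--;
		return false;
	}
	*seq = assigned;
	return true;
}

/* Naive retry: repeat the create until a reply finally arrives. */
int naive_create_with_retry(struct fake_zk *zk)
{
	int seq = -1;

	while (!fake_create_seq(zk, &seq))
		; /* retry on retryable error */
	return seq;
}
```

Starting from next_seq = 179 with two lost replies, naive_create_with_retry() returns 181 while the fake server holds three nodes, so a consumer of the queue would handle the same message three times. Whether the real failure followed exactly this path is not confirmed by the log.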