[sheepdog] [PATCH v2 0/2] zookeeper: fix error handling

Liu Yuan namei.unix at gmail.com
Fri May 31 09:25:18 CEST 2013


On 05/30/2013 10:27 PM, MORITA Kazutaka wrote:
> From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
> 
> v2:
>  - update comment of zk_create_seq_node()
> 
> The first patch fixes a problem under heavy network traffic, and the
> second patch is a clean-up one.
> 
> MORITA Kazutaka (2):
>   zookeeper: retry zk_create_seq_node on retryable error
>   zookeeper: use offsetof to calculate offset
> 
>  sheep/cluster/zookeeper.c |   82 ++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 73 insertions(+), 9 deletions(-)
> 

Just FYI, I have hit the following scenario:

May 31 15:07:51 [main] zk_queue_push(328) create path:/sheepdog/queue/0000000181, queue_pos:0000000179, len:152  <--- zk seems to have retried internally and created 3 seq nodes
May 31 15:07:51 [main] recalculate_vnodes(865) node 7000 has 96 vnodes, free space 355984654336
May 31 15:07:51 [main] recalculate_vnodes(865) node 7001 has 48 vnodes, free space 178013593600
May 31 15:07:51 [main] recalculate_vnodes(865) node 7002 has 48 vnodes, free space 178013462528
May 31 15:07:51 [main] update_epoch_log(42) update epoch: 2, 3
May 31 15:07:52 [rw] prepare_object_list(761) 2
May 31 15:07:52 [rw] wait_get_vdis_done(832) waiting for vdi list
May 31 15:07:52 [rw] wait_get_vdis_done(839) vdi list ready
May 31 15:07:52 [rw] fetch_object_list(670) 10.32.228.126 7001
May 31 15:07:52 [rw] sockfd_cache_get(387) 10.32.228.126:7001, idx 0
May 31 15:07:52 [main] zk_event_handler(803) 1, 179
May 31 15:07:52 [main] zk_queue_pop_advance(366) /sheepdog/queue/0000000179, type:7, len:152, pos:179
May 31 15:07:52 [main] zk_handle_update_node(776) IPv4 ip:10.32.228.126 port:7000 < -- 1
May 31 15:07:52 [main] build_node_list(433) nr_sd_nodes:3
May 31 15:07:52 [main] listen_handler(867) accepted a new connection: 21
May 31 15:07:52 [main] zk_event_handler(803) 1, 180
May 31 15:07:52 [main] zk_queue_pop_advance(366) /sheepdog/queue/0000000180, type:7, len:152, pos:180
May 31 15:07:52 [main] zk_handle_update_node(776) IPv4 ip:10.32.228.126 port:7000 < -- 2
May 31 15:07:52 [main] build_node_list(433) nr_sd_nodes:3
May 31 15:07:52 [main] listen_handler(867) accepted a new connection: 22
May 31 15:07:52 [main] client_handler(808) 1, rx 0, tx 0
May 31 15:07:52 [main] finish_rx(612) 21, 10.32.228.126:37320
May 31 15:07:52 [main] queue_request(353) GET_OBJ_LIST, 1
May 31 15:07:52 [main] zk_event_handler(803) 1, 181
May 31 15:07:52 [io 3112] do_process_work(1376) a1, 0, 2
May 31 15:07:52 [main] zk_queue_pop_advance(366) /sheepdog/queue/0000000181, type:7, len:152, pos:181
May 31 15:07:52 [main] zk_handle_update_node(776) IPv4 ip:10.32.228.126 port:7000 < -- 3

zk_handle_update_node() was called three times for a single push. That does no harm for this
event, but if it were another kind of event, such as a node event, I guess it would break sheep.
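
One possible guard, as a rough sketch only and not what the patch does: tag each
logical push with an id that is reused on retry, and have the consumer drop any
entry whose id it has already handled. Everything below (struct ev, push_id,
is_duplicate(), the fixed-size table) is hypothetical and is not part of
sheep/cluster/zookeeper.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event header, NOT the real sheepdog struct zk_event. */
struct ev {
	uint64_t sender_id; /* assumed unique per node */
	uint64_t push_id;   /* assumed: bumped once per logical push, reused on retry */
	int type;
};

#define MAX_SENDERS 8

/* Last push id handled for each sender seen so far. */
struct seen {
	uint64_t sender_id;
	uint64_t last_push_id;
	bool used;
};

static struct seen seen_tbl[MAX_SENDERS];

/*
 * Return true if e repeats a push that was already handled.
 * Assumes push ids from one sender arrive in increasing order.
 */
static bool is_duplicate(const struct ev *e)
{
	int i;

	/* Known sender: compare against the last id we handled. */
	for (i = 0; i < MAX_SENDERS; i++) {
		struct seen *s = &seen_tbl[i];

		if (s->used && s->sender_id == e->sender_id) {
			if (e->push_id <= s->last_push_id)
				return true;
			s->last_push_id = e->push_id;
			return false;
		}
	}

	/* New sender: remember it in the first free slot. */
	for (i = 0; i < MAX_SENDERS; i++) {
		if (!seen_tbl[i].used) {
			seen_tbl[i].sender_id = e->sender_id;
			seen_tbl[i].last_push_id = e->push_id;
			seen_tbl[i].used = true;
			return false;
		}
	}

	return false; /* table full: this sketch fails open */
}
```

With such a check in the pop path, the entries at 0000000180 and 0000000181 above
would carry the same id as 0000000179 and be skipped rather than handled again.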

Thanks,
Yuan



