On Thu, Jul 18, 2013 at 05:29:06PM +0900, MORITA Kazutaka wrote:
> At Tue, 16 Jul 2013 17:37:55 +0800,
> Liu Yuan wrote:
> >
> > > >
> > > > push_join_response() of the zk driver is called on all the nodes too. So if
> > > > the following case happens, can sheep handle it?
> > > >
> > > > 2 nodes in the cluster {A, B}, and C is joining the cluster.
> > > >
> > > > A -> push_join_response() returns quickly; the watchers of A, B, C are
> > > >      called to handle EVENT_ACCEPT from A.
> > > > B -> push_join_response() returns slowly because of the network; A, B, C
> > > >      handle EVENT_ACCEPT from B.
> > > >
> > > > Simply put, can sheep handle multiple EVENT_ACCEPT events for the same node?
> > >
> > > I think the answer is yes.
> > >
> > >  - local: The event queue is a mmapped file guarded by flock, so
> > >    concurrent sd_accept_handler() calls don't happen.
> > >
> > >  - corosync: cdrv_cpg_deliver() ignores the arriving
> > >    COROSYNC_MSG_TYPE_ACCEPT() if there is no JOIN event in the queue.
> >
> > Corosync actually never tries to send EVENT_ACCEPT more than once in the
> > current code, so no worries about corosync.
>
> This is a timing problem, and it actually happens that multiple nodes
> send COROSYNC_MSG_TYPE_ACCEPT events in my environment.  Either
> way, the corosync driver can handle this problem because it ignores
> the second and later COROSYNC_MSG_TYPE_ACCEPT events.

This is what I meant: the corosync driver ignores redundant ACCEPT events.

>
> >
> > >
> > >  - zookeeper: push_join_response() just overwrites the znode with
> > >    EVENT_ACCEPT, and multiple calls of push_join_response() are no
> > >    problem.
> >
> > I noticed zookeeper sends just one event to the watcher on my test box even
> > if there are multiple updaters of one member of the queue. But I think there
> > is a problem like the example above. I think we need to check inside
> > push_join_response() whether someone has already updated the join event in
> > the queue, to allow only one updater and thus one update event to the
> > watchers of all nodes.
>
> It is not easy to make sure that there is only one updater in the
> cluster.  I am thinking of keeping the master of the zookeeper driver in
> this patch.  This patch cannot remove the master either way, and it
> introduces more complexity rather than simplifying the code.
>

Why is it hard? We can read the zk node and check whether
ev->type == EVENT_ACCEPT, no?

Thanks
Yuan
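
A minimal sketch of the check suggested above, using an in-memory stand-in for
the znode rather than the real zookeeper client. The struct layout, the
`writes` counter and the name push_join_response_once() are hypothetical and
only illustrate the idea; they are not the sheepdog definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real sheepdog types. */
enum zk_event_type { EVENT_JOIN, EVENT_ACCEPT };

struct zk_event {
	enum zk_event_type type;
	int writes; /* number of updates, i.e. watcher notifications triggered */
};

/*
 * Turn the queued join event into EVENT_ACCEPT unless it already is one.
 * Returns true if this caller performed the update, false if it skipped it.
 */
bool push_join_response_once(struct zk_event *znode)
{
	assert(znode != NULL);

	if (znode->type == EVENT_ACCEPT)
		return false; /* already accepted: skip the redundant write */

	znode->type = EVENT_ACCEPT;
	znode->writes++;
	return true;
}
```

Note that the read and the write are two separate steps here. On a real znode
two nodes could both read EVENT_JOIN before either of them writes, so this
sketch alone does not guarantee a single updater.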