[sheepdog] [PATCH] sheep: remove master node
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Thu Jul 18 10:29:06 CEST 2013
At Tue, 16 Jul 2013 17:37:55 +0800,
Liu Yuan wrote:
>
> >
> > > push_join_response() of the zk driver is called on all the nodes too. So if
> > > the following case happens, can sheep handle it?
> > >
> > > 2 nodes in the cluster {A, B}. And C is joining the cluster.
> > >
> > > A -> push_join_response() returns quickly; the watchers of A, B, C are called
> > > to handle EVENT_ACCEPT from A.
> > > B -> push_join_response() returns slowly because of the network; A, B, C handle
> > > EVENT_ACCEPT from B.
> > >
> > > Simply put, can sheep handle multiple EVENT_ACCEPT events for the same node?
> >
> > I think the answer is yes.
> >
> > - local: The event queue is an mmapped file and guarded by flock, so
> > concurrent sd_accept_handler() calls don't happen.
> >
> > - corosync: cdrv_cpg_deliver() ignores the arriving
> > COROSYNC_MSG_TYPE_ACCEPT() if there is no JOIN event in the queue.
>
> Corosync actually never tries to send EVENT_ACCEPT more than once in the current
> code, so no worries about corosync.

This is a timing problem, and in my environment multiple nodes do in
fact send COROSYNC_MSG_TYPE_ACCEPT events. Either way, the corosync
driver can handle this because it ignores the second and later
COROSYNC_MSG_TYPE_ACCEPT events.
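
To illustrate the "ignore the second and later ACCEPT" behavior, here is a
minimal sketch. It is not the actual cdrv_cpg_deliver() code; the types and
the functions queue_push_join() and deliver_accept() are hypothetical and
only model the idea that an ACCEPT is processed when a pending JOIN for that
node is still in the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types; not the real sheepdog definitions. */
enum ev_type { EV_JOIN, EV_ACCEPT };

struct ev {
	enum ev_type type;
	int node_id;		/* node the event is about */
};

#define QUEUE_MAX 16

struct ev_queue {
	struct ev items[QUEUE_MAX];
	size_t len;
};

/* Queue a pending JOIN for node_id; returns false if the queue is full. */
static bool queue_push_join(struct ev_queue *q, int node_id)
{
	assert(q != NULL);
	if (q->len == QUEUE_MAX)
		return false;
	q->items[q->len].type = EV_JOIN;
	q->items[q->len].node_id = node_id;
	q->len++;
	return true;
}

/*
 * Deliver an ACCEPT for node_id.  The first ACCEPT turns the pending JOIN
 * into an ACCEPT and returns true.  Any later ACCEPT for the same node finds
 * no pending JOIN and is ignored (returns false).
 */
static bool deliver_accept(struct ev_queue *q, int node_id)
{
	assert(q != NULL);
	for (size_t i = 0; i < q->len; i++) {
		struct ev *e = &q->items[i];

		if (e->type == EV_JOIN && e->node_id == node_id) {
			e->type = EV_ACCEPT;
			return true;
		}
	}
	return false;
}
```

With this shape it does not matter how many nodes send the ACCEPT: only the
first delivery changes the queue.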
>
> >
> > - zookeeper: push_join_response() just overwrites the znode with
> > EVENT_ACCEPT, and multiple calls of push_join_response() are no
> > problem.
>
> I noticed zookeeper sends just one event to the watcher on my test box even if
> there are multiple updaters of one member of the queue. But I think there is a
> problem as in the above example. I think we need to check inside
> push_join_response() whether someone has already updated the join event in the
> queue, to allow only one updater and thus one update event to the watchers of
> all nodes.

It is not easy to make sure that there is only one updater in the
cluster. I am thinking of keeping the master of the zookeeper driver in
this patch. This patch cannot remove the master either way, and it
introduces additional complexity rather than simplifying the code.
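
For reference, the "only one updater" check discussed above would have to be
atomic: a plain read followed by a write can still race between nodes. The
ZooKeeper C client's zoo_set() takes an expected version and fails with
ZBADVERSION on a mismatch, which is one way such a check could be expressed.
The sketch below only simulates that idea in memory; struct znode and
znode_set_if_version() are hypothetical names, and this is not a proposal for
how the driver should do it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a versioned znode. */
struct znode {
	char data[32];
	int version;		/* bumped on every successful write */
};

/*
 * Write data only if the caller's expected version still matches.
 * Returns true for the single updater that wins, false for everyone
 * who read the same version but arrived later.
 */
static bool znode_set_if_version(struct znode *z, const char *data,
				 int expected)
{
	assert(z != NULL && data != NULL);
	if (z->version != expected)
		return false;
	strncpy(z->data, data, sizeof(z->data) - 1);
	z->data[sizeof(z->data) - 1] = '\0';
	z->version++;
	return true;
}
```

If A and B both read version 0 and then try to write EVENT_ACCEPT, only the
first write succeeds, so the znode changes once.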
Thanks,
Kazutaka