[sheepdog] [PATCH RFC] sheep: free memory used for exceptional nodes

Mon May 13 18:54:41 CEST 2013

At Wed, 08 May 2013 01:08:18 +0900,
Hitoshi Mitake wrote:
> 
> At Mon, 06 May 2013 23:47:26 +0900,
> Hitoshi Mitake wrote:
> > 
> > At Mon,  6 May 2013 23:46:04 +0900,
> > Hitoshi Mitake wrote:
> > > 
> > > It seems that current clear_exceptional_node_lists() leaks memory used
> > > for representing delayed and failed nodes.
> > 
> > 
> > BTW, I have a question about the mechanism of dealing with exceptional
> > nodes. In sd_join_handler(), if the condition: 
> >        nr_local == nr + nr_failed - nr_delayed_nodes
> > is true, status of sheepdog cluster becomes OK.
> > 
> > I couldn't understand the meaning of the above condition. Failed
> > nodes exit immediately, so they should not be counted as workable
> > nodes.  (On the other hand, delayed nodes are not counted as
> > workable, which also seems strange from my perspective.)
> > 
> > I'd be glad if someone could give me an explanation.
> 
> If nobody has a good explanation about this strategy, I'd like to work
> on refactoring it.

I found a clear explanation by Yuan in the following mail.

http://lists.wpkg.org/pipermail/sheepdog/2011-September/001375.html

Refactoring is welcome.  A failed node must have been a member in the previous
epoch, but it seems that the current code doesn't check that at all.  In
addition, we have no tests for the delayed nodes feature, and I'm doubtful
about the correctness of the code.
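
To illustrate the kind of membership check I mean, here is a minimal sketch.
Note that struct example_node, example_node_eq() and
was_member_of_prev_epoch() are hypothetical names made up for this mail, not
existing sheepdog code, and the node identity is simplified to an address
string and a port.  The real check would have to use whatever node list the
previous epoch actually recorded.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified node identity; not sheepdog's real struct. */
struct example_node {
	char addr[64];
	int port;
};

/* Two nodes are treated as the same if address and port both match. */
static bool example_node_eq(const struct example_node *a,
			    const struct example_node *b)
{
	return a->port == b->port && strcmp(a->addr, b->addr) == 0;
}

/*
 * Return true only if 'n' appears in the previous epoch's member list
 * 'prev' (nr_prev entries).  A node reported as failed would be accepted
 * as such only when this returns true.
 */
static bool was_member_of_prev_epoch(const struct example_node *n,
				     const struct example_node *prev,
				     size_t nr_prev)
{
	for (size_t i = 0; i < nr_prev; i++)
		if (example_node_eq(n, &prev[i]))
			return true;
	return false;
}
```

A linear scan should be enough here, assuming the member list of one epoch
stays small.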

Thanks,

Kazutaka