[Sheepdog] Cluster appears down but nodes report different epochs
    MORITA Kazutaka 
    morita.kazutaka at lab.ntt.co.jp
       
    Wed Nov  9 03:44:52 CET 2011
    
    
  
At Tue, 8 Nov 2011 10:56:51 -0500,
Shawn Moore wrote:
> 
> > Probably, we need to add support for using different NICs for data
> > I/Os and monitoring.
> 
> We currently have 4 1Gb nics bonded together using mode 4
> (LACP/802.3ad).  On another note, I am still doing more testing, but
> it almost looks like the TOTAL cluster speed might be limited to 1Gb
> instead of up to 4Gb (I know I can't get that much in truth though,
> but should be higher than 1Gb).  Does anyone have any insight into
> this?  We are using enterprise grade switches with two cards (18Gb/s
> fabric).

How did you measure the total cluster speed?  Could your disks be the
bottleneck?
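
One more thing that may be worth ruling out is the bonding hash itself.
With 802.3ad, each frame is assigned to one member link by a hash, so
traffic between a single pair of hosts usually stays on one 1Gb link.
The sketch below is only a simplified model of a layer2-style policy
(XOR of the last MAC bytes, modulo the link count); the function names
and MAC addresses are made up for illustration, and the hash your bond
actually uses depends on its xmit_hash_policy setting and on the switch.

```python
def mac_last_byte(mac: str) -> int:
    """Return the last octet of a colon-separated MAC address."""
    return int(mac.split(":")[-1], 16)


def pick_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Simplified layer2-style hash: XOR of last MAC octets, modulo links."""
    return (mac_last_byte(src_mac) ^ mac_last_byte(dst_mac)) % n_links


def peak_gbps(flows, n_links: int, link_gbps: float = 1.0) -> float:
    """Upper bound on throughput: only the links the hash selects can carry data."""
    used = {pick_link(src, dst, n_links) for src, dst in flows}
    return len(used) * link_gbps


# Hypothetical addresses: one sender talking to one peer, then to four peers.
me = "00:16:3e:00:00:01"
one_peer = [(me, "00:16:3e:00:00:02")] * 10
four_peers = [(me, "00:16:3e:00:00:0%d" % i) for i in (2, 3, 4, 5)]

print(peak_gbps(one_peer, 4))    # every flow hashes to the same link
print(peak_gbps(four_peers, 4))  # flows spread over several links
```

If a benchmark runs between two machines only, a model like this would
cap it at one link's speed no matter how many NICs are in the bond.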
> 
> 
> >> Shawn, did you format with -H or --nohalt option? If not, might be some
> >> bug in halt path.
> 
> Yes, we need that option as we will have two zones with copies being
> some even number.  So if we don't use -H and one zone goes offline,
> the cluster will quit serving data.
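
But the cluster info on blade161 says:

  [root at blade161 ~]# collie cluster info
  Cluster status: The sheepdog is stopped doing IO, short of living nodes

That is the halted state, which should not occur when the cluster was
formatted with --nohalt, so I think there are some bugs in the halt path.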
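
For reference, the behaviour expected from the description above can be
sketched roughly as follows.  This is only a model of the intended rule,
not the sheep daemon's code: it assumes the cluster halts when the number
of living zones drops below the number of copies, and that nohalt
disables that check.  The function and parameter names are hypothetical.

```python
def cluster_serves_io(living_zones: int, copies: int, nohalt: bool) -> bool:
    """Model of the expected halt rule (an assumption, not sheepdog's code)."""
    if living_zones < 1:
        return False          # nothing left to serve data
    if nohalt:
        return True           # formatted with -H / --nohalt: keep serving
    return living_zones >= copies  # otherwise halt when short of living zones


# Two zones, copies = 2, one zone goes offline:
print(cluster_serves_io(living_zones=1, copies=2, nohalt=False))  # halts
print(cluster_serves_io(living_zones=1, copies=2, nohalt=True))   # keeps serving
```

Under this model, a cluster formatted with nohalt should never report
that it stopped doing IO while at least one zone is alive, which is why
the output above looks like a bug.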
Thanks,
Kazutaka
    
    