On 09/20/2011 08:29 PM, Shawn Moore wrote:
>> So I guess you have shut down the cluster with the 'collie cluster shutdown'
>> command, no?
> I did not use the shutdown command because I was attempting to
> simulate what would happen if an entire zone went down. For us a zone
> would be a datacenter (physically separated).
>
> I did go ahead and attempt to issue it now, but I get:
> [root@node174 ~]# collie cluster shutdown
> Waiting for other nodes joining
>
>> Would you please attach the log from the nodes that would not
>> join?
> You can find the logs from the four nodes here:
> http://www.stormpoint.com/files/sheepdog_logs.tgz

Hi Shawn,

Thanks for your log. It is helpful. I have root-caused the problem (an epoch version mismatch during recovery), but unfortunately there is no easy patch yet. I am going to cook up a patch that handles exactly this problem soon.

Yuan