On Mon, Jul 30, 2012 at 4:35 PM, Yunkai Zhang <yunkai.me at gmail.com> wrote:
> On Mon, Jul 30, 2012 at 4:24 PM, Liu Yuan <namei.unix at gmail.com> wrote:
>> On 07/30/2012 04:17 PM, Yunkai Zhang wrote:
>>> Can you show me more information? It works well in my testing, and
>>
>> What kind of information? I just asked whether your patch set can work
>> in the following situation:
>>
>> While you do the manual recovery (be it group join or group kill),
>> some other nodes fail unexpectedly. What is the result then? For example:
>> 0 we have 3 nodes with 2 copies (d0,d1,d2)
>> 1 start manual group add, add nodes x1,x2
>> 2 some nodes d1,d2 go down in the meantime <-- no membership event
>> propagated to the cluster? If not, how do we handle the IO routed to
>> failed nodes x1, x2?
>
> Good question. To simplify this patchset, I'll let sheep continue
> to process LEAVE events even if we have started delayed recovery.

Or just let sheep retry until recovery has finished. I need more testing for
this complicated situation.

>
>> 3 stop manual group add.
>>
>> The expected result is (d0, x1, x2). What do the epochs look like? Like
>> the following?
>>
>> epoch 1: (d0, d1, d2)
>> epoch 2: (d0, x1, x2)
>>
>> Thanks,
>> Yuan
>
> --
> Yunkai Zhang
> Work at Taobao

--
Yunkai Zhang
Work at Taobao