At Thu, 10 May 2012 12:09:36 +0800,
Liu Yuan wrote:
>
> On 05/10/2012 10:52 AM, MORITA Kazutaka wrote:
>
> > At Thu, 10 May 2012 07:55:46 +0800,
> > Liu Yuan wrote:
> >>
> >> On 05/10/2012 12:20 AM, MORITA Kazutaka wrote:
> >>
> >>> At Wed, 09 May 2012 21:40:45 +0800,
> >>> Liu Yuan wrote:
> >>>>
> >>>> If you are really concerned of this misbehavior, I think we try re-visit
> >>>> Wuyue's patch.
> >>>
> >>> I guess we'll have to agree to disagree on what sheepdog write cache
> >>> should be. I think it should work in the same way as a block device
> >>> cache, but you think we should limit the usage to simplify the code
> >>> and explain what's the limitation in the documentation. Wuyue's patch
> >>> looks necessary from my point of view, but unnecessary to you. I
> >>> don't intend to complain any more because it's you who implements the
> >>> current object cache and it should work as you expect. If I need the
> >>> write cache which works like a block device one, I should implement it
> >>> as another cache feature.
> >>>
> >>
> >>
> >> I've had enough. You and Christoph jump out to bash the object cache
> >> time and time again, and seems that never try to read the code to see
> >> what is the real culprit and simply ignore or misread my argument.
> >>
> >> You criticize that object can't handle the mixed writethrough &
> >> writeback requests, but it actually does. What cause wrong vdi list
> >> output is we don't handle vdi opcode correctly, and I already point to a
> >
> > Yes, so I wrote "Some vdi operations don't care about cached data" in
> > the commit log. It's confusing to call requests from
> > write_object()/read_object() writethrough, sorry.
> >
> >> patch and suggest to 're-visit the patch. well, I think I have clearly
> >> stated the intention to merge Wujue's patch(can 're-visit' mean another
> >
> > Is it really okay to you? If yes, I think it's better.
> >
>
>
> Yes, I have talked to Wujue and he will rebase the patch soon.

Thanks a lot.

> >
> > What I wanted to say is that I wonder we need to have multiple cache
> > features, and, with regard to the object cache, I should respect your
> > opinion because you are the author.
> >
> >> thing?), but 're-visit' makes you complain again that I look like a
> >> dictator who never hear your argument.
> >>
> >> All the way down you and Christoph simply criticize object cache for it
> >> can't handle writethrough & writeback requests, which is completely
> >> *false* argument. Even if it holds true, I think the best way is to
> >> submit a patch to solve the problem and makes it better instead of a
> >> simple patch, just to disable it. Yes you can express your dislike about
> >
> > No, I don't intend that.
> >
> > The object cache is really useful when:
> > - the gateway nodes have much memory

Sorry, this line should be "the gateway nodes have enough storage area".

> > - the latency between sheep nodes are high, and it's expensive to
> >   replicate data (e.g. use sheepdog with WAN)
> >
>
>
> Hmm, DIO object cache doesn't need extra client(gateway node) memory,
> this should probably be the default option because then it becomes
> completely a disk cache that can survive the host crash.

I know some users who use sheepdog with two clusters:

- Storage cluster
  Each machine has huge storage area, and sheeps (non-gateway) run on
  it.

- VM cluster
  Each machine has huge memory to run many VMs, and only tiny storage
  to store its host operating system, and gateway nodes and qemus run
  on the machine.

For such users, the current object storage is not suitable because the
gateway nodes cannot store enough cache data to the small local disk.
It makes sense to run the gateway nodes on VM cluster, because the
gateway can automatically failover when the storage node fails.

If the gateway nodes can store enough cache data to the local disk,
the object cache is useful. This is what I meant.

>
> Also I want to mention that, for starting up hundreds of (cloned) Guest
> VM concurrently, object cache seems to be a must on a massive nodes
> cluster. We also get the benefit that all the cloned VM on the same node
> would share large scale of cached objects.

Okay, I agree to keep it default now.

Thanks,

Kazutaka