[sheepdog-users] Object Cache Performance

Thu Jan 9 12:03:50 CET 2014

Hi,

> This is expected result, cache means we firstly need to cache the data
> somewhere else for future read/write (thus future read/write accelerated,
> but for the first reference, it will degrade the performance because we do
> some extra IOs)
> 
> It is really necessary to cache the data first for later reference, or how can we
> accelerate the future operations without locally cached data?
> 

If the data already exists locally, there is no need for acceleration, at least for read operations. 

So as far as I understand the design, it should be possible to

- read data without cache usage if it is locally available
- if data has to be pulled from another node, then the cache should be used to accelerate future operations
- if data is written, it should always be written to the cache first and then flushed from the cache to all nodes
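
To make the three rules above concrete, here is a rough sketch in Python. It is only a toy model of the behaviour I have in mind, not how sheepdog implements its object cache; all names (Node, Cluster, read_object, write_object, flush) are made up for illustration. One assumption I had to add: the cache is still looked up first on reads so that data written but not yet flushed is seen, but locally stored data is never copied into the cache.

```python
class Cluster:
    """Hypothetical cluster: just a list of nodes holding objects."""

    def __init__(self, nodes):
        self.nodes = nodes

    def fetch(self, oid, exclude):
        # Pull the object from any other node that stores it.
        for node in self.nodes:
            if node is not exclude and oid in node.local:
                return node.local[oid]
        raise KeyError(oid)

    def store_everywhere(self, oid, data):
        # Simplification: "all nodes" means every node in the list.
        for node in self.nodes:
            node.local[oid] = data


class Node:
    """Toy model of one node with a local store and an object cache."""

    def __init__(self, local_objects=None):
        self.local = dict(local_objects or {})  # objects stored on this node
        self.cache = {}                         # object cache of this node
        self.dirty = set()                      # cached objects not yet flushed
        self.remote_reads = 0                   # reads served by another node

    def read_object(self, oid, cluster):
        if oid in self.cache:
            # Already cached (possibly dirty): serve from the cache.
            return self.cache[oid]
        if oid in self.local:
            # Rule 1: locally available, so read it directly, no cache fill.
            return self.local[oid]
        # Rule 2: remote data is pulled once and cached for future reads.
        self.remote_reads += 1
        data = cluster.fetch(oid, exclude=self)
        self.cache[oid] = data
        return data

    def write_object(self, oid, data):
        # Rule 3: writes always go to the cache first.
        self.cache[oid] = data
        self.dirty.add(oid)

    def flush(self, cluster):
        # Rule 3 (second half): push dirty objects from the cache to all nodes.
        for oid in sorted(self.dirty):
            cluster.store_everywhere(oid, self.cache[oid])
        self.dirty.clear()
```

With two nodes a and b in one Cluster, reading an object that a stores itself leaves a.cache empty, while reading an object stored only on b increments a.remote_reads once and serves every later read of that object from a.cache.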

I don't have very deep knowledge of the internals (I am still learning a lot :-), but I would expect such a design to speed up read access, while write access would stay at the same speed. Since in most cases there are more read than write operations, this would improve overall performance with the cache enabled.

> If your cluster
> size grow, the effect of cache will more pronounced because 99% IO
> requests will be satisfied by cache locally (assume no cache overrun, no
> cache miss)

This is only true if all locally used VMs fit into the cache. In that case the optimization I suggest above would bring no benefit beyond speeding up the first access. Maybe I just need to increase my cache size, which might be an issue when handling large VMs.

Thanks for the explanation

Regards

Gerald