[sheepdog-users] Object Cache Performance

Liu Yuan namei.unix at gmail.com
Thu Jan 9 12:37:28 CET 2014


On Thu, Jan 09, 2014 at 12:03:50PM +0100, Gerald Richter - ECOS wrote:
> Hi,
> 
> > This is the expected result; caching means we first need to cache the
> > data somewhere else for future reads/writes (so future reads/writes are
> > accelerated, but the first reference degrades performance because we do
> > some extra IOs).
> > 
> > Is it really necessary to cache the data first for later reference, or
> > how can we accelerate future operations without locally cached data?
> > 
> 
> If the data already exists locally, there is no need for acceleration, at least for read operations. 
> 
> So as far as I understand the design, it should be possible to
> 
> - read data without cache usage if it is locally available
> - if data has to be pulled from another node, then the cache should be used to accelerate future operations
> - if data is written, it should always be written to the cache first and flushed from the cache to all nodes

Yes, we can do as you suggest, but the tradeoff is more code complexity,
and the benefit does not hold in all cases. For example:

Take two requests: the first is a read, the second is a write. With your suggestion,

1) read the data locally without pulling it into the cache (faster)
2) pull the data into the cache and write (slower)

So for overall performance, your suggestion will not work well; the
acceleration of your approach is limited to the case where
1) there are only read requests and no subsequent write requests, and
2) all the data is already local.

In a VM environment with a large-scale cluster, the above two conditions are
very hard to meet.
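
Here is a minimal sketch (in C, with hypothetical names, not actual Sheepdog
code) of the two-request example above. It shows that a read served from a
local replica with the cache bypassed leaves the cache cold, so the following
write still has to pull the object into the cache before it can be dirtied:

#include <stdbool.h>
#include <stdio.h>

struct object {
	bool on_local_disk;   /* a replica already lives on this node */
	bool in_cache;        /* object is present in the local object cache */
};

static void read_object(struct object *obj)
{
	if (obj->on_local_disk && !obj->in_cache) {
		puts("read: served from local replica, cache bypassed (fast)");
		return;
	}
	if (!obj->in_cache) {
		puts("read: pulling object into cache from a remote node");
		obj->in_cache = true;
	}
	puts("read: served from cache");
}

static void write_object(struct object *obj)
{
	if (!obj->in_cache) {
		/* The earlier bypassed read did not warm the cache, so the
		 * write pays the pull cost anyway. */
		puts("write: pulling object into cache first (slow)");
		obj->in_cache = true;
	}
	puts("write: dirtied in cache, flushed to the replica nodes later");
}

int main(void)
{
	struct object obj = { .on_local_disk = true, .in_cache = false };

	read_object(&obj);   /* request 1: fast, but leaves the cache cold */
	write_object(&obj);  /* request 2: still has to populate the cache */
	return 0;
}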

> 
> I don't have very deep knowledge of the internals (I am also learning a lot :-), but I would expect that such a design would speed up read access, while write access would stay at the same speed. Since in most cases there are more read than write operations, this would speed up cache usage.
> 
> > If your cluster size grows, the effect of the cache will be more
> > pronounced because 99% of IO requests will be satisfied by the cache
> > locally (assuming no cache overrun and no cache misses)
> 
> This is only true if all locally used VMs fit into the cache. In that case the optimization I suggest above would not bring any benefit; it would only speed up the first access. Maybe I just need to increase my cache size, which might be an issue when handling large VMs.
> 

The object cache is designed for a large cache size, targeting a dedicated SSD
device of e.g. 512G, which is unlikely to be overrun because 10% of the space
is still available (in my example, 51.2G) when the reclaimer begins to reclaim
objects. (Reclaiming clean (non-dirty) objects is super fast because it only
needs a local remove operation.)
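
As a rough illustration (hypothetical names, not the real reclaim code), the
sizing rule above boils down to a watermark check: reclaim kicks in once free
cache space drops below 10% of the device, which for a 512G SSD still leaves
51.2G of headroom:

#include <stdbool.h>
#include <stdio.h>

#define CACHE_SIZE_GB   512.0
#define FREE_RATIO_LOW  0.10    /* start reclaiming below 10% free space */

static bool need_reclaim(double used_gb)
{
	double free_gb = CACHE_SIZE_GB - used_gb;

	return free_gb < CACHE_SIZE_GB * FREE_RATIO_LOW;
}

int main(void)
{
	printf("reclaim headroom: %.1f GB\n", CACHE_SIZE_GB * FREE_RATIO_LOW);
	printf("used 400 GB -> reclaim needed: %s\n",
	       need_reclaim(400.0) ? "yes" : "no");
	printf("used 470 GB -> reclaim needed: %s\n",
	       need_reclaim(470.0) ? "yes" : "no");
	return 0;
}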

But this doesn't mean we don't need the 'direct reclaim' I mentioned
previously, which would rule out the overrun case completely.

We just need time to implement it (the same is true for the time-expire
reclaimer).

Thanks
Yuan


