[sheepdog-users] device priority by performance and number of accesses

Joseph Glanville joseph at cloudscaling.com
Thu Jun 6 04:56:17 CEST 2013

Hot block migration and storage tiering are good ideas but take time
to implement.
Ideally this would be part of the cluster-wide allocation scheme.

Instead of each node performing tiering, you split the cluster up into
its tiers.
Thus you have num_of_tiers virtual clusters. You migrate data from
one virtual cluster to another based on its usage pattern.
Doing the stat tracking can be somewhat difficult in a distributed
system, though. Tier uses an amortized offline method, which is probably
the best idea.
This is done by recording accesses to a fast log which is flushed
to disk every now and then.
A background job would then run, collecting the logs, determining
the hottest blocks and migrating them. This process should probably
target blocks that are currently writeable. Commonly accessed
read-only blocks can simply be cached client side.
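
To make the log-then-analyze idea a bit more concrete, here is a rough
Python sketch. It is not sheepdog or Tier code: the AccessLog class, the
one-line-per-block text format and the hottest_writable_blocks() helper
are all made up for illustration, and block ids are assumed to be
strings without spaces.

```python
from collections import Counter


class AccessLog:
    """Hypothetical in-memory access counter, flushed to disk in batches."""

    def __init__(self, path, flush_every=1000):
        self.path = path
        self.flush_every = flush_every  # accesses between flushes (assumed)
        self.counts = Counter()
        self.pending = 0

    def record(self, block_id, writable=True):
        # Count the access in memory; touch the disk only once per batch.
        self.counts[(block_id, writable)] += 1
        self.pending += 1
        if self.pending >= self.flush_every:
            self.flush()

    def flush(self):
        # Append "block_id writable_flag count" lines to the log file.
        with open(self.path, "a") as f:
            for (block_id, writable), n in self.counts.items():
                f.write(f"{block_id} {int(writable)} {n}\n")
        self.counts.clear()
        self.pending = 0


def hottest_writable_blocks(log_paths, top_n):
    """Background job: merge the flushed logs and rank writable blocks."""
    totals = Counter()
    for path in log_paths:
        with open(path) as f:
            for line in f:
                block_id, writable, n = line.split()
                if writable == "1":  # read-only blocks are left to caches
                    totals[block_id] += int(n)
    return [block for block, _ in totals.most_common(top_n)]
```

The background job would take the log files collected from each node
and hand the returned block ids to whatever does the actual migration
between tiers, which is left out here.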

The main difficulty is in representing this architecture in sheepdog's
current model (maybe special vnodes?) and writing a fast in-memory
access log (usually a b+tree). It's also worth looking at the
accesses in relation to their position in the VDI. Tiering will most
benefit highly random workloads that write to only a portion of an
object. We can use the access data to find sequential workloads and
try to align those objects as well, similar to how enterprise SANs work.
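
As a rough illustration of spotting sequential workloads from the
access data, here is a Python sketch. It assumes each logged access is
a (vdi, object_index) pair in time order. That record format and the
function name are hypothetical, not anything sheepdog provides.

```python
from collections import defaultdict


def sequential_ratio_by_vdi(accesses):
    """Fraction of accesses per VDI that hit the object right after the
    previously accessed one in the same VDI.

    accesses: iterable of (vdi, object_index) pairs in time order
    (an assumed log format).
    """
    last_index = {}
    sequential = defaultdict(int)
    total = defaultdict(int)
    for vdi, index in accesses:
        if vdi in last_index:
            total[vdi] += 1
            if index == last_index[vdi] + 1:
                sequential[vdi] += 1
        last_index[vdi] = index
    return {vdi: sequential[vdi] / total[vdi] for vdi in total}
```

A ratio near 1.0 suggests a sequential workload whose objects could be
laid out together; a ratio near 0.0 suggests a random workload, which
is the better candidate for the fast tier.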


PS: The concept of VDI being a special type of "multi-part object"
should probably stand, but we should also have some concept of standard
multi-part objects for when we try to implement things like
sheepfs/swift/s3, because it will be highly beneficial to split
objects up into multiple parts.
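
A minimal Python sketch of what a generic multi-part object could look
like: fixed-size parts plus a way to map a byte offset to a part. The
function names and the part size are placeholders for illustration,
not an existing sheepdog interface.

```python
def split_into_parts(data, part_size):
    """Split a bytes object into consecutive parts of at most part_size."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def locate(offset, part_size):
    """Map a byte offset to (part_index, offset_within_part)."""
    return divmod(offset, part_size)
```

Each part can then be placed, replicated or migrated on its own, which
is the benefit of splitting in the first place.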

On Wed, Jun 5, 2013 at 9:11 AM, Valerio Pachera <sirio81 at gmail.com> wrote:
> When sheep writes on multiple devices, it writes like RAID 0.
> This should give us very good performance.
> I read a bit about http://sourceforge.net/projects/tier/.
> The idea behind it is to write the most accessed chunks to the faster devices.
> If we have a server with some mechanical disks and one or more
> SSDs, this approach may raise performance even more.
> The idea may be implemented in later versions of sheepdog if you think
> it's worth it.
> I was discussing with Alessandro how to get the best out of some SSDs
> we have and wish to use in the cluster.
> Using 250G of disk for cache is too much and may lead to trouble
> when it's time to flush it.
> What do you think about it?
