[Sheepdog] Use custom redundancy for some hosts

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Thu Jun 9 20:37:20 CEST 2011


At Thu, 09 Jun 2011 00:50:30 +0200 (CEST),
Frédéric Grelot wrote:
> 
> Hi all, 
> 
> I'm not (yet) using sheepdog, but I thought of a feature that would be interesting for some special configurations (including mine, as you would guess...) and that may be quite easy to implement (I don't know anything beyond the presentation slides on sheepdog's website and a few other pieces of information found on the net, but you never know...).
> 
> Imagine a scenario where the user has several heterogeneous servers: some have RAID storage, others don't. Let's imagine that the administrator wants a replication factor of 3.
> I know the replication factor is very easy to set up, and that's the power of sheepdog. But in this case, the redundancy that the RAID server already provides is not taken into account. An interesting feature would be that when the administrator creates the store on that server ("$ sheep /store_dir" in the documentation's example), he passes an extra option indicating that this store already has a "replication factor" of 2. Something like
> "$ sheep /store_dir --copies=2"
> 
> That way, when the cluster is created ("collie cluster format --copies=3"), this storage point counts as 2 copies, while the other ones count as 1.
> 
> Another interesting feature would be an option to force the use of a particular storage point whenever it is available.
> 
> True, that breaks the "perfect" symmetry of sheepdog, but these two features together would permit the following kind of scenario: there is one main server (storage point) on the network, reliable and relatively fast. A few other servers (VM hosts) with "simple" storage (for example, a single disk holding the system plus otherwise wasted space, no RAID) are added to the sheepdog cluster. The administrator would thus be able to create a cluster where the VM images are always secured on the main server (by forcing its use, setting --copies=2 for this server and --copies=3 for the cluster), while every read access to the VM images could still go indifferently to the main server or to one of the secondary servers, offloading the former. Furthermore, if the main server went down for any reason, the cluster would keep working (and that alone is a very big point...)
> 
> I'm not sure my explanations are clear, but I think these two options mainly involve cluster creation, not the core of sheepdog. Still, they would show that sheepdog can quite easily be extended to support many use cases.
> 
> Thanks for reading, and I hope you'll find these suggestions helpful!

I think this is really useful, especially when the reliability of
disks is not symmetric.

These two features need a few changes to sheepdog's consistent
hashing algorithm (the scheme that decides which nodes store data).
The changes would be less than 1,000 lines.  If there is some demand
for these features, I'll be willing to support them.  And, of course,
I'm happy if someone implements them :)
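
To make the idea concrete, below is a rough sketch of how replica
selection on the ring could honor a per-node weight and a "forced"
flag.  This is not actual Sheepdog code; the names (struct node_info,
pick_replicas, ...) are invented for illustration only.

/*
 * Sketch only: walk the hash-sorted ring from the object's position
 * and stop once the accumulated weight reaches the requested copy
 * count.  A node marked "forced" is always taken first, and a node
 * with weight 2 (e.g. a RAID server) counts as two copies.
 */
#include <stdio.h>

#define MAX_NODES 8

struct node_info {
	const char *name;
	int weight;	/* local redundancy: a RAID node might report 2 */
	int forced;	/* always include this node as a replica target */
};

static int pick_replicas(struct node_info *ring, int nr_nodes,
			 int start, int copies,
			 struct node_info **out, int *nr_out)
{
	int got = 0, n = 0, i;

	/* forced nodes are chosen unconditionally */
	for (i = 0; i < nr_nodes && got < copies; i++)
		if (ring[i].forced) {
			out[n++] = &ring[i];
			got += ring[i].weight;
		}

	/* then walk the ring until enough copies are accumulated */
	for (i = 0; i < nr_nodes && got < copies; i++) {
		struct node_info *node = &ring[(start + i) % nr_nodes];

		if (node->forced)
			continue;	/* already counted above */
		out[n++] = node;
		got += node->weight;
	}

	*nr_out = n;
	return got >= copies ? 0 : -1;
}

int main(void)
{
	/* one reliable RAID server counting as 2 copies, two plain hosts */
	struct node_info ring[] = {
		{ "plain-a", 1, 0 },
		{ "raid-main", 2, 1 },
		{ "plain-b", 1, 0 },
	};
	struct node_info *out[MAX_NODES];
	int i, n;

	if (pick_replicas(ring, 3, 0, 3, out, &n) == 0)
		for (i = 0; i < n; i++)
			printf("replica on %s (weight %d)\n",
			       out[i]->name, out[i]->weight);
	return 0;
}

With --copies=3 for the cluster, this example places one copy on the
forced RAID server (counting as two) and only one more on a plain
host, which matches the scenario above.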

> 
> (Furthermore, I also hope that sheepdog will be supported by libvirt soon; that would be really great news!)

Libvirt has supported Sheepdog since version 0.8.7.
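
For example, a Sheepdog VDI can be attached to a domain through the
libvirt C API with a network disk element.  This is only a sketch:
the domain name "guest1" and the VDI name "alice" are placeholders,
and whether hot-attach works depends on your libvirt and QEMU
versions (the same <disk> XML can also be placed in the static
domain definition).

#include <stdio.h>
#include <libvirt/libvirt.h>

int main(void)
{
	/* sheepdog-backed disk; "alice" is an example VDI name */
	const char *disk_xml =
		"<disk type='network' device='disk'>"
		"  <driver name='qemu' type='raw'/>"
		"  <source protocol='sheepdog' name='alice'/>"
		"  <target dev='vdb' bus='virtio'/>"
		"</disk>";
	virConnectPtr conn = virConnectOpen("qemu:///system");
	virDomainPtr dom;

	if (!conn)
		return 1;
	dom = virDomainLookupByName(conn, "guest1");
	if (dom) {
		if (virDomainAttachDevice(dom, disk_xml) < 0)
			fprintf(stderr, "attach failed\n");
		virDomainFree(dom);
	}
	virConnectClose(conn);
	return 0;
}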

Thanks,

Kazutaka


