[Sheepdog] sheepdog and RAID

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Wed Mar 23 18:19:36 CET 2011

At Mon, 21 Mar 2011 08:21:48 -0700 (PDT),
Ski Mountain wrote:
> >> One problem I do see with starting many sheep daemons on the same server that 
> >> has many disks is that (especially on small clusters) it is possible for all 
> >> data for one or more virtual machines to be stored on one physical server.  
> >
> >Could you explain this in more detail?  Fixing this looks like the
> >right way to go to me.
> As far as I understand the ring architecture, each machine sits on a ring 
> and VMs are RAIDed across the ring.  
> These days it is very easy to have 10+ disks on one server, and the best way 
> to set up a machine with many disks is to assign a sheep daemon to each disk.  
> I am simply saying it would be good if there were some additional sanity 
> checks put into the sheepdog architecture so that it is not possible for a VM 
> to be stored entirely on one server with many disks.  I know this would be the 
> exception, not the rule, but I would just like all bases to be covered.

I think the right way is to support location-aware data placement.
Replicated data needs to be placed on different racks or in different
data centers.  If we supported location-aware data placement, it
would also solve your problem, because replicated data would be
placed on different machines.
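
A minimal sketch of the idea, not Sheepdog's actual placement code.
It assumes each sheep daemon carries a location label (here simply
the host name) and that replicas are chosen by walking a hash ring.
The names build_ring and place_replicas, and the use of SHA-1, are
hypothetical and chosen only for illustration.

```python
import hashlib
from bisect import bisect_right


def _hash(key):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


def build_ring(nodes):
    """nodes maps a sheep id (e.g. 'hostA:7000') to its location label."""
    return sorted((_hash(node_id), node_id, loc) for node_id, loc in nodes.items())


def place_replicas(ring, object_id, copies):
    """Walk the ring clockwise from the object's position and pick sheep,
    skipping any sheep whose location already holds a replica.

    Returns fewer than `copies` sheep if there are not enough distinct
    locations in the cluster.
    """
    start = bisect_right(ring, (_hash(object_id),))
    chosen, used_locations = [], set()
    for i in range(len(ring)):
        _, node_id, loc = ring[(start + i) % len(ring)]
        if loc in used_locations:
            continue  # same machine (or rack): do not place a second copy here
        chosen.append(node_id)
        used_locations.add(loc)
        if len(chosen) == copies:
            break
    return chosen


# Ten sheep on hostA (one per disk), one sheep each on hostB and hostC.
nodes = {"hostA:%d" % (7000 + i): "hostA" for i in range(10)}
nodes.update({"hostB:7000": "hostB", "hostC:7000": "hostC"})
ring = build_ring(nodes)
replicas = place_replicas(ring, "vdi-object-1", copies=3)
```

With the label set to a rack or data center name instead of the host
name, the same walk would spread replicas across racks or sites.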

> Or is that already listed under the to-do as "better data re-balancing"?
> >
> >> Would it be possible to do say
> >> sheep /store_disk0  /store_disk1  /store_disk2  /store_disk3  /store_disk4  
> >> /store_disk5 /store_disk6
> >> So all mount points on a server would be handled by the same sheep using 
> >> multiple threads
> >> 
> >> Add a mount point 
> >> sheep -a /store_disk7
> >> 
> >> Remove a mount point
> >> sheep -r /store_disk7
> >
> >It is possible to support multiple disks.  But if running multiple
> >daemons solves the problem, I'd like to keep the current simple design
> >(one daemon for one disk).
> Would it be possible to run one sheep daemon per server, but spawn a child 
> process for each disk?  Then use the above method for adding and removing disks 
> for the machine so that the administrator does not have to keep track of port 
> numbers?

Yes, there are no technical reasons why we cannot implement that in
Sheepdog.  But if the only reason to support the above usage is to
avoid managing port numbers, I think it is better to create a wrapper
script around the sheep daemon and support it in the wrapper.  You
can achieve that easily without touching the sheep daemon.
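
A rough sketch of the bookkeeping such a wrapper could do, so that
the administrator never handles port numbers directly.  The helper
names (add_disk, remove_disk, sheep_command) and the base port 7000
are hypothetical, and the `sheep -p PORT DIR` command line is an
assumption that should be checked against the installed sheep.  The
sketch only builds the command line and does not start any daemon.

```python
def add_disk(ports, store_dir, base_port=7000):
    """Assign the lowest free port to store_dir and record it in `ports`.

    `ports` maps store directory -> port.  Existing assignments are never
    changed, so daemons that are already running keep their ports.
    """
    if store_dir in ports:
        return ports[store_dir]
    used = set(ports.values())
    port = base_port
    while port in used:
        port += 1
    ports[store_dir] = port
    return port


def remove_disk(ports, store_dir):
    """Forget store_dir and return the port it used (None if unknown)."""
    return ports.pop(store_dir, None)


def sheep_command(store_dir, port):
    """Build the command line for one daemon on one disk.

    ASSUMPTION: sheep accepts `-p PORT` followed by the store directory.
    """
    return ["sheep", "-p", str(port), store_dir]


# One daemon per mount point, as in the /store_disk0../store_disk6 example.
ports = {}
for n in range(7):
    add_disk(ports, "/store_disk%d" % n)
commands = [sheep_command(d, p) for d, p in sorted(ports.items())]
```

A real wrapper would also need to persist the mapping between runs
and stop the matching daemon when a disk is removed.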


