[Sheepdog] sheepdog and RAID

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Wed Mar 23 18:19:36 CET 2011


At Mon, 21 Mar 2011 08:21:48 -0700 (PDT),
Ski Mountain wrote:
> >> One problem I do see with starting many sheep daemons on the same server that
> >> has many disks is that (especially on small clusters) it is possible for all
> >> data for one or more virtual machines to be stored on one physical server.
> >
> >Could you explain this in more detail?  Fixing this looks like the
> >right way to go to me.
> 
> As far as I understand the ring architecture, each machine sits on a ring
> and VMs are RAIDed across the ring.
> 
> 
> On a server that has many disks (these days it is very easy to have 10+
> disks on one server), the best way to set up the machine is to assign a
> sheep daemon to each disk.  I am simply saying it would be good if there
> were some additional sanity checks put into the sheepdog architecture so
> that it is not possible for a VM to be stored entirely on one server with
> many disks.  I know this would be the exception, not the rule, but I would
> just like all bases to be covered.

I think the right way is to support location-aware data placement.
There is a need to place replicated data on different racks or in
different data centers.  If we supported location-aware data
placement, it would also solve your problem by placing replicated data
on different machines.
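
To make the idea concrete, here is a minimal sketch of location-aware
replica selection on a hash ring.  It is not Sheepdog's actual code: the
function name place_replicas, the (daemon, machine) tuples, and the use
of SHA-1 are assumptions made only for illustration.  The walk around the
ring skips any daemon whose machine already holds a copy.

```python
import hashlib
from bisect import bisect_right


def _hash(key: str) -> int:
    # Map a string to a 64-bit position on the ring (SHA-1 is an assumption).
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


def place_replicas(object_id, daemons, copies):
    """Pick up to `copies` daemons for an object, one per machine.

    `daemons` is a list of hypothetical (daemon_id, machine) tuples.
    Returns fewer than `copies` ids if there are not enough machines.
    """
    ring = sorted((_hash(daemon), daemon, machine) for daemon, machine in daemons)
    points = [point for point, _, _ in ring]
    # Start at the first daemon clockwise from the object's position.
    start = bisect_right(points, _hash(object_id))

    chosen = []
    used_machines = set()
    for i in range(len(ring)):
        _, daemon, machine = ring[(start + i) % len(ring)]
        if machine in used_machines:
            continue  # this machine already holds a copy, keep walking
        chosen.append(daemon)
        used_machines.add(machine)
        if len(chosen) == copies:
            break
    return chosen
```

With three daemons on one machine and one daemon on another, asking for
two copies always yields one daemon from each machine.  The same check
could compare rack or data center labels instead of machine names.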

> 
> Or is that already listed in the to-do list as "better data re-balancing"?
> 
> >
> >> Would it be possible to do, say,
> >> sheep /store_disk0 /store_disk1 /store_disk2 /store_disk3 /store_disk4 /store_disk5 /store_disk6
> >> so that all mount points on a server would be handled by the same sheep using
> >> multiple threads?
> >> 
> >> Add a mount point 
> >> sheep -a /store_disk7
> >> 
> >> Remove a mount point
> >> sheep -r /store_disk7
> >
> >It is possible to support multiple disks.  But if running multiple
> >daemons solves the problem, I'd like to keep the current simple design
> >(one daemon for one disk).
> 
> Would it be possible to run one sheep daemon per server, but spawn a child
> process for each disk?  The above method could then be used for adding and
> removing disks on the machine, so that the administrator does not have to
> keep track of port numbers.

Yes, there are no technical reasons why we cannot implement that in
Sheepdog.  But if the only reason to support the above usage is to
avoid managing port numbers, I think it is better to create a wrapper
script for the sheep daemon and support it in the wrapper.  You can
achieve that easily without touching the sheep daemon.
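
As a rough sketch of what such a wrapper could keep track of, the
following class maps mount points to ports and builds the command line
for each daemon.  The class name SheepWrapper, the base port 7000, and
the "-p" port option are assumptions for illustration; check them against
your sheep version.  The sketch only builds argument lists and does not
start or stop any process.

```python
class SheepWrapper:
    """Hypothetical bookkeeping for one sheep daemon per mount point."""

    def __init__(self, base_port=7000):
        self.base_port = base_port  # assumed starting port
        self.ports = {}             # mount point -> assigned port

    def add(self, path):
        """Assign the lowest free port to `path` and return the command."""
        if path in self.ports:
            raise ValueError(f"{path} is already managed")
        used = set(self.ports.values())
        port = self.base_port
        while port in used:
            port += 1
        self.ports[path] = port
        # "-p" as the port option is an assumption, not verified here.
        return ["sheep", path, "-p", str(port)]

    def remove(self, path):
        """Forget `path` and return the port its daemon was using."""
        return self.ports.pop(path)
```

The administrator then only names mount points, for example
wrapper.add("/store_disk7") and wrapper.remove("/store_disk7"), and the
wrapper decides which port each daemon gets.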


Thanks,

Kazutaka


