[sheepdog] [PATCH 0/8] add basic raid support

Liu Yuan namei.unix at gmail.com
Mon Mar 11 03:17:03 CET 2013


On 03/11/2013 08:55 AM, MORITA Kazutaka wrote:
> At Sun, 10 Mar 2013 22:19:21 +0800,
> Liu Yuan wrote:
>>
>> From: Liu Yuan <tailai.ly at taobao.com>
>>
>> This patch set implements the basic RAID support that aims to manage multiple
>> disks in one node. 
>>
>> The basic idea is to implement a RAID-0-like mechanism that distributes sheep
>> objects across the local disks without parity or replication, relying instead
>> on sheepdog's replicated storage to recover the objects lost on a faulty
>> disk.
>>
>> The raid module uses a private consistent hash ring per sheep for object
>> distribution, which makes the raid layer completely transparent to sheep node
>> management. This means that hot-plugging or unplugging a disk (including a
>> faulty disk) on the local sheep won't cause object movement between the nodes.
>>
>> This series implements only the basic object distribution control of the raid
>> module. The missing parts are internal object recovery between local disks
>> inside the node and a collie command to hot plug/unplug a disk into the sheep
>> daemon, which are meant to be written in the next series.
> 
> So can we set the redundancy level of the raid module in the next series?
> 

No. I plan to recover lost objects from other sheep. Maybe what I wrote
was somewhat misleading. With copy=1, we maximize disk space
utilization and performance, and the code ends up simpler too.
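
For reference, the private per-sheep ring from the cover letter quoted
above could look roughly like the sketch below. It is not the code in the
patches: the names (disk_ring, ring_build, ring_lookup), the FNV-1a hash,
and the limits MAX_DISKS and VNODES_PER_DISK are all assumptions made for
illustration. It only shows the idea that each local disk contributes
several points to a ring and an object goes to the first point at or after
its own hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_DISKS       16  /* hypothetical limit for this sketch */
#define VNODES_PER_DISK 64  /* hypothetical; more points give a smoother spread */
#define FNV_INIT        0xcbf29ce484222325ULL

struct vdisk { uint64_t hash; int disk; };
struct disk_ring { struct vdisk v[MAX_DISKS * VNODES_PER_DISK]; int nr; };

/* 64-bit FNV-1a, chosen here only because it is short; an assumption. */
static uint64_t fnv1a(const void *buf, size_t len, uint64_t h)
{
	const unsigned char *p = buf;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Order ring points by hash; break ties by disk id to stay deterministic. */
static int vdisk_cmp(const void *a, const void *b)
{
	const struct vdisk *x = a, *y = b;

	if (x->hash != y->hash)
		return x->hash < y->hash ? -1 : 1;
	return x->disk - y->disk;
}

/* Place VNODES_PER_DISK points per disk; points depend only on the disk id. */
static void ring_build(struct disk_ring *r, const int *disk_ids, int nr_disks)
{
	assert(nr_disks > 0 && nr_disks <= MAX_DISKS);
	r->nr = 0;
	for (int d = 0; d < nr_disks; d++) {
		uint64_t seed = fnv1a(&disk_ids[d], sizeof(disk_ids[d]), FNV_INIT);

		for (int i = 0; i < VNODES_PER_DISK; i++) {
			r->v[r->nr].hash = fnv1a(&i, sizeof(i), seed);
			r->v[r->nr].disk = disk_ids[d];
			r->nr++;
		}
	}
	qsort(r->v, r->nr, sizeof(r->v[0]), vdisk_cmp);
}

/* Return the disk owning oid: first point with hash >= key, wrapping to 0. */
static int ring_lookup(const struct disk_ring *r, uint64_t oid)
{
	uint64_t key = fnv1a(&oid, sizeof(oid), FNV_INIT);
	int lo = 0, hi = r->nr;

	assert(r->nr > 0);
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;

		if (r->v[mid].hash < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return r->v[lo == r->nr ? 0 : lo].disk;
}
```

If you build one ring for disks {0, 1, 2} and another for {0, 2}, every
object that mapped to disk 0 or 2 keeps its disk, and only the objects
that lived on disk 1 get a new home. Since the ring is private to the
sheep, none of this is visible to the other nodes.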

>>
>> To enable raid:
>>  $ sheep /path/to/meta-store,/path/to/disk1,/path/to/disk2[,...]
>>
>> We need to pass the meta-store, which holds the sheep's meta information such
>> as epoch and config, as the first parameter.
> 
> I prefer a more symmetric design.  Can the meta-store be a SPOF of the
> node even if we can set the redundancy level in the next series?
> 

Yeah, the meta-store is a SPOF, just as before with the one-disk-per-sheep
configuration. Replicating all the meta files across all the disks looks
more complicated to me, because we would need management code to maintain
the replicated files. If we have many disks, say 256, do we need to
maintain 256 copies? That looks over-redundant to me.

I think the meta-store can be placed on the operating system partition,
which is a SPOF too. If this partition is broken, no matter how many
replicas we keep for the sheep, the sheep will still be broken. My
argument is that placing the meta-store on the OS partition is simple and
achieves the same actual level of fault tolerance. (People can use RAID
for this partition.)
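
To make the "meta-store first, then the data disks" form of the command
line quoted above concrete, here is a rough sketch of how such an argument
could be split. The struct store_paths, the function parse_store_paths,
and the MAX_DISKS limit are hypothetical names for illustration, not what
the patches do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DISKS 64  /* hypothetical limit for this sketch */

struct store_paths {
	char *meta;              /* first component: the meta-store */
	char *disks[MAX_DISKS];  /* remaining components: data disks */
	int nr_disks;
};

/*
 * Split "meta,disk1,disk2,..." in place. Returns 0 on success, -1 on an
 * empty component or too many disks. The pointers in sp refer into arg.
 */
static int parse_store_paths(char *arg, struct store_paths *sp)
{
	sp->meta = NULL;
	sp->nr_disks = 0;

	for (char *p = arg; p != NULL; ) {
		char *comma = strchr(p, ',');

		if (comma)
			*comma = '\0';
		if (*p == '\0')
			return -1;  /* empty path component */
		if (!sp->meta)
			sp->meta = p;
		else if (sp->nr_disks < MAX_DISKS)
			sp->disks[sp->nr_disks++] = p;
		else
			return -1;  /* more disks than the sketch allows */
		p = comma ? comma + 1 : NULL;
	}
	return 0;
}
```

With only one path given, nr_disks stays 0, and in this sketch it is left
to the caller to decide how to treat that case.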

Thanks,
Yuan



