[sheepdog-users] Corosync Vs Zookeeper Backends

Andrew J. Hobbs ajhobbs at desu.edu
Fri Mar 14 19:15:27 CET 2014


Out of curiosity, is there a transcription error in the listed 
commands?  The Corosync command has /meta,... while there is a space 
between /meta and the data volumes in the Zookeeper command.  If that 
space is, in fact, in the command, then you aren't actually striping 
data across the discs for Zookeeper but just writing everything to the 
single /meta location.  That would indeed explain the 40MB/s speeds.
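
If the space is only a transcription slip, the corrected Zookeeper 
invocation would presumably just mirror the Corosync one, with /meta and 
all the disc paths joined by commas into a single argument, e.g. (disc 
list abbreviated; the full disc0..disc13 list from the quoted command 
applies):

   sheep -n -c zookeeper:172.21.5.161:2181,172.21.5.162:2181,172.21.5.163:2181,172.21.5.164:2181,172.21.5.165:2181 /meta,/var/lib/sheepdog/disc0,/var/lib/sheepdog/disc1,...,/var/lib/sheepdog/disc13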

On 03/14/2014 10:58 AM, Liu Yuan wrote:
> On Fri, Mar 14, 2014 at 02:43:59PM +0000, Aydelott, Ryan M. wrote:
>> Interesting note on the data management; I have not dug much into Sheepdog's internals yet, but the only possible explanation would be if metadata on file object positioning were being retrieved through the backend.
> Each volume has only a 4MB meta object that tracks the allocation bitmap for its objects,
> mainly for thin provisioning. We don't record any placement metadata at all;
> instead we rely 100% on consistent hashing for placement of all the objects, resulting
> in astonishing load and space balance. We have run 150 nodes with 12 disks on
> each node and confirmed the evenly distributed load and space. As you noticed,
> we don't have meta servers.
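
A minimal sketch of the consistent-hashing idea described above (not 
sheepdog's actual placement code; the node names, virtual-node count, 
and object ID format here are hypothetical). The point is that an 
object's ID alone determines which nodes hold it, so no per-object 
location table is ever needed:

    import hashlib
    from bisect import bisect

    def _hash(key):
        # Map an arbitrary string key onto a 64-bit ring position.
        return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")

    class Ring:
        def __init__(self, nodes, vnodes=128):
            # Several virtual points per node smooth out load/space balance.
            self.points = sorted((_hash("%s-%d" % (n, i)), n)
                                 for n in nodes for i in range(vnodes))
            self.keys = [p[0] for p in self.points]

        def locate(self, object_id, copies=3):
            # Walk clockwise from the object's hash, collecting distinct nodes.
            idx = bisect(self.keys, _hash(object_id))
            placed = []
            for off in range(len(self.points)):
                node = self.points[(idx + off) % len(self.points)][1]
                if node not in placed:
                    placed.append(node)
                    if len(placed) == copies:
                        break
            return placed

    ring = Ring(["node%d" % i for i in range(150)])
    print(ring.locate("vdi_f9dc065b:obj_00000042"))  # same ID, same nodes, every time
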
>
>> I agree with your statements, as the IO on the zookeeper nodes is very small, nowhere near the data we are pushing through the sheep daemons.
>>
>> Sheepdog is being started as follows for each test iteration:
>>
>> Corosync: sheep -n -c corosync:172.21.5.0 /meta,/var/lib/sheepdog/disc0,/var/lib/sheepdog/disc1,/var/lib/sheepdog/disc2,/var/lib/sheepdog/disc3,/var/lib/sheepdog/disc4,/var/lib/sheepdog/disc5,/var/lib/sheepdog/disc6,/var/lib/sheepdog/disc7,/var/lib/sheepdog/disc8,/var/lib/sheepdog/disc9,/var/lib/sheepdog/disc10,/var/lib/sheepdog/disc11,/var/lib/sheepdog/disc12,/var/lib/sheepdog/disc13
>>
>> Zookeeper: sheep -n -c zookeeper:172.21.5.161:2181,172.21.5.162:2181,172.21.5.163:2181,172.21.5.164:2181,172.21.5.165:2181 /meta /var/lib/sheepdog/disc0,/var/lib/sheepdog/disc1,/var/lib/sheepdog/disc2,/var/lib/sheepdog/disc3,/var/lib/sheepdog/disc4,/var/lib/sheepdog/disc5,/var/lib/sheepdog/disc6,/var/lib/sheepdog/disc7,/var/lib/sheepdog/disc8,/var/lib/sheepdog/disc9,/var/lib/sheepdog/disc10,/var/lib/sheepdog/disc11,/var/lib/sheepdog/disc12,/var/lib/sheepdog/disc13
>>
>> The driver we wrote/use is: https://github.com/devoid/nova/tree/sheepdog-nova-support-havana
>>
>> Which builds out libvirt.xml as follows:
>>
>>      <disk type="network" device="disk">
>>        <driver name="qemu" cache="writethrough"/>
>>        <source protocol="sheepdog" name="//172.21.5.141:7000/instance_f9dc065b-d05d-47cb-a3e6-b02049f049df_disk"/>
>>        <target bus="virtio" dev="vda"/>
>>      </disk>
>>
> Hmm, since you don't use the object cache, you might try setting 'cache=none' to save
> one extra internal flush operation in QEMU. But this won't make a big difference.
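
With that change, the driver line in the generated libvirt.xml would 
presumably become (everything else in the disk element unchanged):

      <disk type="network" device="disk">
        <driver name="qemu" cache="none"/>
        <source protocol="sheepdog" name="//172.21.5.141:7000/instance_f9dc065b-d05d-47cb-a3e6-b02049f049df_disk"/>
        <target bus="virtio" dev="vda"/>
      </disk>
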
>
> I have no idea why you see such a huge difference with zookeeper. Weird. Is this
> result reproducible? I guess something is wrong elsewhere in the configuration or the system.
>
> P.S. I have long wanted someone with access to IB hardware to develop a native IB
> network transport for sheepdog instead of IP-over-IB.
>
> Thanks
> Yuan


