[sheepdog-users] hyper volume issue

Hitoshi Mitake mitake.hitoshi at gmail.com
Tue Sep 22 05:05:25 CEST 2015


On Tue, Sep 22, 2015 at 4:01 AM, Glen Aidukas
<GAidukas at behaviormatrix.com> wrote:
> Hitoshi,
>
> Thanks for your response. As I said, I will be creating a new test cluster with the newer code and testing the -z 23 option.
>
> I was wondering what the practical limits are.  I guess with -z 23 I can get an 8TB VDI, but I currently have some VMs on Ceph with over 16TB and may be looking to go as large as 32TB.  I know I can make multiple VDIs and even use software RAID0 to join several 4TB or 8TB VDIs into one large volume, but I was wondering what the targeted design limits are for sheepdog.  Also, I guess I could use something like -z 24 or 25 and get larger objects, but don't larger objects cause some performance issues?

Sheepdog's metadata object (the inode) can point to 1M data objects. 1M * 4MB
(the default object size) = 4TB, so the default maximum VDI size is
4TB. The simplest way to provide VDIs larger than 4TB is to increase
the object size.
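
For example, following the arithmetic above (assuming -z values up to
25 are accepted; please check on your build):

  max VDI size = 1M objects * (1 << z) bytes per object
    -z 22 ->  4 MB objects ->  4 TB (default)
    -z 23 ->  8 MB objects ->  8 TB
    -z 24 -> 16 MB objects -> 16 TB
    -z 25 -> 32 MB objects -> 32 TB

So something like the following should cover your 32TB case:

  $ dog vdi create <vdi name> 32T -z 25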

There are pros and cons to larger objects.
pros:
 - They reduce the probability of COW, which means the metadata object
needs to be updated less often, so write performance can be improved
if you take many snapshots.

cons:
 - If some objects are read frequently, they can concentrate burst
read traffic on a few nodes. You can find this problem described in
section 2.5 of the GFS paper: http://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf

Thanks,
Hitoshi

>
> Thanks,
>
> Glen
>
>
>
> -----Original Message-----
> From: Hitoshi Mitake [mailto:mitake.hitoshi at gmail.com]
> Sent: Monday, September 21, 2015 10:33 AM
> To: Glen Aidukas
> Cc: sheepdog-users (sheepdog-users at lists.wpkg.org)
> Subject: Re: [sheepdog-users] hyper volume issue
>
> On Mon, Sep 21, 2015 at 10:57 PM, Glen Aidukas <GAidukas at behaviormatrix.com> wrote:
>> Hitoshi,
>>
>> In a few days I will be setting up a new test cluster with three nodes, each with 1 journal drive and 3 data drives, and 10Gbit networking.  I will test with the new branch then.
>
> Please never use journaling; it doesn't contribute to performance and it degrades stability. It will be removed in the future.
>
>>
>> What is the effect of using '-z 23'?
>
> -z is the block size shift. -z 23 means a newly created VDI will use
> 1 << 23 B (8MB) objects. That will solve the 4TB limitation.
>
> Thanks,
> Hitoshi
>
>
>>
>> Regards,
>>
>> Glen
>>
>>
>> -----Original Message-----
>> From: Hitoshi Mitake [mailto:mitake.hitoshi at gmail.com]
>> Sent: Monday, September 21, 2015 4:10 AM
>> To: Glen Aidukas
>> Cc: sheepdog-users (sheepdog-users at lists.wpkg.org)
>> Subject: Re: [sheepdog-users] hyper volume issue
>>
>> Hi Glen,
>>
>> hypervolume isn't a feature for VDIs; it is only used for the HTTP object store.
>>
>> Could you try the command line below with the master branch:
>> $ dog vdi create <vdi name> 5T -z 23
>>
>> The master branch supports VDIs larger than 4TB. The feature isn't backported to 0.9.x yet.
>>
>> Thanks,
>> Hitoshi
>>
>>
>> On Fri, Sep 18, 2015 at 10:29 PM, Glen Aidukas <GAidukas at behaviormatrix.com> wrote:
>>> Hello,
>>>
>>>
>>>
>>> I'm seeing an issue with hyper volumes on my two-node test cluster
>>> using v0.9.2.  While performing some tests, I tried to create a 5TB
>>> VDI and saw that normal volumes are limited to 4TB and that I needed
>>> to use the -y switch to create a hyper volume.  This worked for
>>> creating it, but then I was not able to start my VM when connecting to
>>> it.  I then tried to delete the hyper volume and the cluster
>>> crashed.  When I restarted the cluster I was still having issues.  I
>>> was able to reformat the cluster, reimport my test VMs, and all was working again.
>>>
>>>
>>>
>>> To check whether using the hyper volume VDI is what caused the issue,
>>> I created another 5TB hyper volume VDI and then deleted it, and I got the same issue.
>>>
>>>
>>>
>>> -Glen
>>>
>>>
>>>
>>> Here is a recreation of the issue:
>>>
>>>
>>>
>>> Take special note of the log file entry showing:
>>>
>>> Sep 18 09:08:25  EMERG [deletion] traverse_btree(190) PANIC: This
>>> B-tree not support depth 0?
>>>
>>>
>>>
>>>
>>>
>>> root at pmsd-a01:~# sheep -v
>>>
>>> Sheepdog daemon version 0.9.2
>>>
>>>
>>>
>>> root at pmsd-a01:~# dog node md info --all
>>>
>>> Id      Size    Used    Avail   Use%    Path
>>>
>>> Node 0:
>>>
>>> 0      1.8 TB  10 GB   1.8 TB    0%    /var/lib/sheepdog//disc1
>>>
>>> 1      1.8 TB  9.7 GB  1.8 TB    0%    /var/lib/sheepdog//disc2
>>>
>>> 2      1.8 TB  10 GB   1.8 TB    0%    /var/lib/sheepdog//disc3
>>>
>>> Node 1:
>>>
>>> 0      1.8 TB  10 GB   1.8 TB    0%    /var/lib/sheepdog//disc1
>>>
>>> 1      1.8 TB  9.7 GB  1.8 TB    0%    /var/lib/sheepdog//disc2
>>>
>>> 2      1.8 TB  10 GB   1.8 TB    0%    /var/lib/sheepdog//disc3
>>>
>>>
>>>
>>>
>>>
>>> root at pmsd-a01:~# dog node info
>>>
>>> Id      Size    Used    Avail   Use%
>>>
>>> 0      5.4 TB  30 GB   5.3 TB    0%
>>>
>>> 1      5.4 TB  30 GB   5.3 TB    0%
>>>
>>> Total   11 TB   60 GB   11 TB     0%
>>>
>>>
>>>
>>> Total virtual image size        512 GB
>>>
>>>
>>>
>>> root at pmsd-a01:~# dog vdi list
>>>
>>>   Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
>>>   vm-60153-disk-1     0  128 GB  7.5 GB  0.0 MB 2015-09-17 11:13   2e24f0       2
>>>   vm-60154-disk-1     0  128 GB  7.5 GB  0.0 MB 2015-09-17 11:24   57e86b       2
>>>   vm-60152-disk-1     0  128 GB  7.5 GB  0.0 MB 2015-09-17 11:00   794d35       2
>>>   vm-60151-disk-1     0  128 GB  7.5 GB  0.0 MB 2015-09-17 10:35   c286fb       2
>>>
>>>
>>>
>>> root at pmsd-a01:~# dog vdi create test-5tb-hv 5T -y
>>>
>>> root at pmsd-a01:~# dog vdi list
>>>
>>>   Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
>>>   vm-60153-disk-1     0  128 GB  7.5 GB  0.0 MB 2015-09-17 11:13   2e24f0       2
>>>   test-5tb-hv         0  5.0 TB  0.0 MB  0.0 MB 2015-09-18 09:07   3af976       2
>>>   vm-60154-disk-1     0  128 GB  7.5 GB  0.0 MB 2015-09-17 11:24   57e86b       2
>>>   vm-60152-disk-1     0  128 GB  7.5 GB  0.0 MB 2015-09-17 11:00   794d35       2
>>>   vm-60151-disk-1     0  128 GB  7.5 GB  0.0 MB 2015-09-17 10:35   c286fb       2
>>>
>>>
>>>
>>>
>>>
>>> root at pmsd-a01:~# dog vdi delete test-5tb-hv
>>>
>>> failed to read a response
>>>
>>>
>>>
>>>
>>>
>>> root at pmsd-a01:~# dog node info
>>>
>>> failed to connect to 127.0.0.1:7000: Connection refused
>>>
>>> failed to connect to 127.0.0.1:7000: Connection refused
>>>
>>> Failed to get node list
>>>
>>>
>>>
>>> root at pmsd-a01:~# service sheepdog restart
>>>
>>> Restarting Sheepdog Server: sheepdog.
>>>
>>>
>>>
>>> root at pmsd-a02:~# service sheepdog restart              (on second node)
>>>
>>> Restarting Sheepdog Server: sheepdog.
>>>
>>>
>>>
>>>
>>>
>>> root at pmsd-a01:~# dog vdi list
>>>
>>>
>>>
>>> PANIC: Depth of B-tree is out of range(depth: 0)
>>>
>>>   Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
>>>   vm-60153-disk-1     0  128 GB  7.5 GB  0.0 MB 2015-09-17 11:13   2e24f0       2
>>>
>>> dog exits unexpectedly (Aborted).
>>>
>>> sh: 1: addr2line: not found
>>>
>>> :
>>>
>>> sh: 1: addr2line: not found
>>>
>>> :
>>>
>>> sh: 1: addr2line: not found
>>>
>>> :
>>>
>>> sh: 1: addr2line: not found
>>>
>>>
>>>
>>>
>>>
>>> root at pmsd-a01:~# less sheedog.log
>>>
>>> Sep 18 09:07:52   INFO [main] rx_main(830) req=0x7f1d64032760, fd=189,
>>> client=127.0.0.1:59938, op=NEW_VDI, data=(not string)
>>>
>>> Sep 18 09:07:52   INFO [main] tx_main(882) req=0x7f1d64032760, fd=189,
>>> client=127.0.0.1:59938, op=NEW_VDI, result=00
>>>
>>> Sep 18 09:08:25   INFO [main] rx_main(830) req=0x7f1d6406aa40, fd=189,
>>> client=127.0.0.1:59970, op=DEL_VDI, data=(not string)
>>>
>>> Sep 18 09:08:25  EMERG [deletion] traverse_btree(190) PANIC: This
>>> B-tree not support depth 0
>>>
>>> Sep 18 09:08:25  EMERG [deletion] crash_handler(268) sheep exits
>>> unexpectedly (Aborted).
>>>
>>> Sep 18 09:08:25  EMERG [deletion] sd_backtrace(840)
>>> /usr/sbin/sheep(+0xb19b) [0x7f1d7c55319b]
>>>
>>> Sep 18 09:08:25  EMERG [deletion] sd_backtrace(840)
>>> /lib/x86_64-linux-gnu/libpthread.so.0(+0xf09f) [0x7f1d7c10f09f]
>>>
>>> Sep 18 09:08:25  EMERG [deletion] sd_backtrace(840)
>>> /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x34) [0x7f1d7b2e2164]
>>>
>>> Sep 18 09:08:25  EMERG [deletion] sd_backtrace(840)
>>> /lib/x86_64-linux-gnu/libc.so.6(abort+0x17f) [0x7f1d7b2e53df]
>>>
>>> Sep 18 09:08:25  EMERG [deletion] sd_backtrace(840)
>>> /usr/sbin/sheep(+0x41566) [0x7f1d7c589566]
>>>
>>> Sep 18 09:08:25  EMERG [deletion] sd_backtrace(840)
>>> /usr/sbin/sheep(+0x16415) [0x7f1d7c55e415]
>>>
>>> Sep 18 09:08:25  EMERG [deletion] sd_backtrace(840)
>>> /usr/sbin/sheep(+0x3daba) [0x7f1d7c585aba]
>>>
>>> Sep 18 09:08:25  EMERG [deletion] sd_backtrace(840)
>>> /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b4f) [0x7f1d7c106b4f]
>>>
>>> Sep 18 09:08:25  EMERG [deletion] sd_backtrace(840)
>>> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6c) [0x7f1d7b38b95c]
>>>
>>> Sep 18 09:08:54   INFO [main] md_add_disk(343) /var/lib/sheepdog//disc1,
>>> vdisk nr 1834, total disk 1
>>>
>>> Sep 18 09:08:54   INFO [main] md_add_disk(343) /var/lib/sheepdog//disc2,
>>> vdisk nr 1834, total disk 2
>>>
>>> Sep 18 09:08:54   INFO [main] md_add_disk(343) /var/lib/sheepdog//disc3,
>>> vdisk nr 1834, total disk 3
>>>
>>> Sep 18 09:08:55 NOTICE [main] get_local_addr(522) found IPv4 address
>>>
>>> Sep 18 09:08:55   INFO [main] send_join_request(1032) IPv4 ip:10.30.20.81
>>> port:7000 going to join the cluster
>>>
>>> Sep 18 09:08:55  ERROR [main] zk_join(975) Previous zookeeper session
>>> exist, shoot myself. Please wait for 30 seconds to join me again.
>>>
>>> Sep 18 09:13:00   INFO [main] md_add_disk(343) /var/lib/sheepdog//disc1,
>>> vdisk nr 1834, total disk 1
>>>
>>> Sep 18 09:13:00   INFO [main] md_add_disk(343) /var/lib/sheepdog//disc2,
>>> vdisk nr 1834, total disk 2
>>>
>>> Sep 18 09:13:00   INFO [main] md_add_disk(343) /var/lib/sheepdog//disc3,
>>> vdisk nr 1834, total disk 3
>>>
>>> Sep 18 09:13:01 NOTICE [main] get_local_addr(522) found IPv4 address
>>>
>>> Sep 18 09:13:01   INFO [main] send_join_request(1032) IPv4 ip:10.30.20.81
>>> port:7000 going to join the cluster
>>>
>>> Sep 18 09:13:01   INFO [main] replay_journal_entry(161)
>>> /var/lib/sheepdog//disc2/0057e86b00007c02, size 59904, off 380928, 0
>>>
>>> Sep 18 09:13:01   INFO [main] replay_journal_entry(161)
>>> /var/lib/sheepdog//disc2/0057e86b00007c02, size 130560, off 448000, 0
>>>
>>> Sep 18 09:13:01   INFO [main] replay_journal_entry(161)
>>> /var/lib/sheepdog//disc2/0057e86b00007c02, size 53248, off 584192, 0
>>>
>>> Sep 18 09:13:01   INFO [main] replay_journal_entry(161)
>>> /var/lib/sheepdog//disc2/0057e86b00007c02, size 60416, off 642048, 0
>>>
>>> Sep 18 09:13:01   INFO [main] replay_journal_entry(161)
>>> /var/lib/sheepdog//disc2/0057e86b00007c02, size 45568, off 706560, 0
>>>
>>> Sep 18 09:13:01   INFO [main] replay_journal_entry(161)
>>> /var/lib/sheepdog//disc2/0057e86b00007c02, size 65536, off 756736, 0
>>>
>>> Sep 18 09:13:01   INFO [main] replay_journal_entry(161)
>>> /var/lib/sheepdog//disc2/0057e86b00007c02, size 29184, off 888832, 0
>>>
>>> Sep 18 09:13:01   INFO [main] replay_journal_entry(161)
>>> /var/lib/sheepdog//disc2/0057e86b00007c02, size 84480, off 923136, 0
>>>
>>>
>>>
>>>
>>> --
>>> sheepdog-users mailing lists
>>> sheepdog-users at lists.wpkg.org
>>> https://lists.wpkg.org/mailman/listinfo/sheepdog-users
>>>

