[Sheepdog] qemu-img convert slowness and high availability status

krimson krims0n32 at gmail.com
Tue Jun 14 22:06:59 CEST 2011


Here is some more information on the cluster join failure:

node 1:
# sheep -f /data/sheep
sheep: jrnl_recover(2221) Openning the directory /data/sheep/journal/00000003/.
sheep: set_addr(1595) addr = 172.16.1.1, port = 7000
sheep: main(144) Sheepdog daemon (version 0.2.3) started
sheep: get_cluster_status(408) sheepdog is waiting with newer epoch, 1 3 172.16.1.2:7000
(the line above appears when I start sheep on node 2)

node 2:
# sheep -f /data/sheep
sheep: jrnl_recover(2221) Openning the directory /data/sheep/journal/00000001/.
sheep: jrnl_recover(2226) start jrnl_recovery.
sheep: jrnl_recover(2267) end jrnl_recovery.
sheep: set_addr(1595) addr = 172.16.1.2, port = 7000
sheep: main(144) Sheepdog daemon (version 0.2.3) started
sheep: send_join_request(1048) 33624236 17428
sheep: update_cluster_info(568) failed to join sheepdog, 66
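
To put the two conversion timings quoted below in perspective, here is a rough throughput comparison. It is only a sketch: it assumes that "8GB" means 8 GiB, takes the "2.5 hours" and "2m12.866s" figures as they appear in the quoted messages, and the function name is made up for illustration.

```python
# Rough throughput comparison for the two conversions quoted below.
# Assumption: "8GB" is taken as 8 GiB (8192 MiB); the real image size may differ.

def throughput_mib_s(size_mib: float, seconds: float) -> float:
    """Average throughput in MiB/s for copying size_mib in the given time."""
    return size_mib / seconds

IMAGE_MIB = 8 * 1024

# LVM volume -> sheepdog:web, reported as about 2.5 hours
to_sheepdog = throughput_mib_s(IMAGE_MIB, 2.5 * 3600)
# LVM volume -> local file, reported as "real 2m12.866s"
to_local_file = throughput_mib_s(IMAGE_MIB, 2 * 60 + 12.866)

print(f"to sheepdog:   {to_sheepdog:.2f} MiB/s")
print(f"to local file: {to_local_file:.2f} MiB/s")
print(f"slowdown:      {to_local_file / to_sheepdog:.0f}x")
```

Under these assumptions the sheepdog conversion averages a little under 1 MiB/s, against roughly 62 MiB/s for the local file, so the gap is far larger than a Gbit link or copies=2 alone would explain.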

On 06/14/2011 09:35 PM, krimson wrote:
> time qemu-img convert /dev/vmvg/web /home/spja/ff
>
> real    2m12.866s
> user    0m0.680s
> sys    0m11.950s
>
> quite the difference :)
>
> I have now run into another issue: after killing the sheep daemon on one
> of the cluster nodes, sheepdog seems unable to form a cluster anymore:
> sheep: get_cluster_status(408) sheepdog is waiting with newer epoch, 1 3 172.16.1.2:7000
>
> whereas the other node says "older epoch" instead of "newer epoch".
> How can I resolve this without losing the VDIs in the sheep store?
>
> Thanks!
>
> On 06/14/2011 06:05 PM, MORITA Kazutaka wrote:
>> At Mon, 13 Jun 2011 23:35:32 +0200,
>> krimson wrote:
>>> Hi again :)
>>>
>>> I was converting an LVM based image to sheepdog using:
>>>
>>> # qemu-img convert /dev/vmvg/web sheepdog:web
>>>
>>> and noticed it took 2.5 hours to complete. The image is 8GB in size.
>>> The sheepdog/corosync network is a direct Gbit link between two nodes.
>>> Any idea as to why this is taking so long? Small blocksize perhaps? The
>>> cluster is using copies=2. Using sheepdog from git (0.2.3) and qemu-kvm
>>> 0.14.
>> How long does it take when you convert the image to a raw image on
>> LVM?
>>   $ time qemu-img convert /dev/vmvg/web /your_store_directory/web.raw
>>
>> This should show an ideal time for converting to Sheepdog images on
>> your machine.
>>
>>> Another question, regarding the state of high availability in sheepdog:
>>> I noticed it is possible to specify multiple sheepdog hosts in my qemu
>>> XML definition, and it defaults to localhost.
>> Sheepdog accepts only one host in the XML format.  See also
>> http://libvirt.org/formatdomain.html#elementsDisks
>>
>>> I would like to avoid specifying multiple sheepdog hosts in the XML
>>> definition if possible, and was wondering what happens to the VMs when,
>>> for example, the sheep daemon dies. The entire node failing would not be
>>> a problem, I guess, as the remaining node would then start up the VMs
>>> that were running on the failed node.
>> Unfortunately, there is no automatic connection failover support
>> because Sheepdog assumes that VMs connect to localhost (the failure of
>> localhost means that VMs cannot survive any more).
>>
>> We have a plan to remove this restriction in the future, but currently
>> Sheepdog cannot handle the failure.
>>
>>
>> Thanks,
>>
>> Kazutaka
>



