[Sheepdog] Sheepdog Read-only issues.

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Tue Apr 12 11:57:16 CEST 2011


At Sun, 10 Apr 2011 10:15:39 -0400,
Eric Renfro wrote:
> 
> I had another occurrence of the issue.
> 
> The first test case was simple: 2 nodes, 2 copies, and a very basic
> pacemaker configuration. The configuration consisted of a primitive to
> start the sheepdog servers on each node from lsb:sheepdog, a primitive
> for each libvirt ocf:heartbeat:VirtualDomain guest, and location
> constraints that place the virtual machines according to the node
> attribute scores of each node.
> 
> Something like:
> 
> node ygg1 \
>          utilization memory="8192" cpu="4" \
>          attributes fw1="100" com="50" vserver="1" sheep="1" standby="off"
> node ygg2 \
>          utilization memory="8192" cpu="4" \
>          attributes fw1="50" com="100" vserver="1" sheep="1"
> primitive sheep lsb:sheepdog
> primitive kvm_vfw1 ocf:heartbeat:VirtualDomain \
>          params config="/etc/libvirt/qemu/vfw1.xml" hypervisor="qemu:///system" migration_transport="ssh" \
>          meta allow-migrate="true" priority="10" target-role="Started" is-managed="true" resource-stickiness="2" migration-threshold="2" \
>          op start interval="0" timeout="120s" \
>          op stop interval="0" timeout="120s" \
>          op migrate_to interval="0" timeout="120s" \
>          op migrate_from interval="0" timeout="120s" \
>          op monitor interval="10" timeout="30" depth="0" \
>          utilization memory="512" cpu="1"
> primitive kvm_com ocf:heartbeat:VirtualDomain \
>          params config="/etc/libvirt/qemu/com.xml" hypervisor="qemu:///system" migration_transport="ssh" \
>          meta allow-migrate="true" priority="5" target-role="Started" is-managed="true" resource-stickiness="2" migration-threshold="2" \
>          op start interval="0" timeout="120s" \
>          op stop interval="0" timeout="120s" \
>          op migrate_to interval="0" timeout="120s" \
>          op migrate_from interval="0" timeout="120s" \
>          op monitor interval="10" timeout="30" depth="0" \
>          utilization memory="512" cpu="1"
> location sheep-loc sheep \
>          rule $id="sheep-loc-rule-0" -inf: not_defined sheep or sheep number:lte 0 \
>          rule $id="sheep-loc-rule-1" sheep: defined sheep
> location com-os-loc kvm_com \
>          rule $id="com-os-loc-rule-0" -inf: not_defined com or com number:lte 0 or not_defined vserver or vserver number:lte 0 \
>          rule $id="com-os-loc-rule-1" com: defined com and defined vserver
> location vfw1-os-loc kvm_vfw1 \
>          rule $id="vfw1-os-loc-rule-0" -inf: not_defined fw1 or fw1 number:lte 0 or not_defined vserver or vserver number:lte 0 \
>          rule $id="vfw1-os-loc-rule-1" fw1: defined fw1 and defined vserver
> colocation kvm_com-loc inf: kvm_com sheep:Started
> colocation kvm_vfw1-loc inf: kvm_vfw1 sheep:Started
> property $id="cib-bootstrap-options" \
>          dc-version="1.1.5-jlkjgjhgfjhf" \
>          cluster-infrastructure="openais" \
>          expected-quorum-votes="2" \
>          stonith-enabled="false" \
>          last-lrm-refresh="1302345517" \
>          placement-strategy="utilization"
> 
> What this setup does, in plain English, is:
> - Run sheepdog only on nodes that have sheep="1" or higher, so that
>   non-storage nodes can also be part of the cluster.
> - Run kvm_com (and kvm_vfw1 similarly) only on nodes with com="1" (or
>   fw1="1") or higher, prioritized by the attribute value; the differing
>   scores provide seamless failover and failback.
> - Make each kvm_* resource also depend on sheepdog running on the node
>   it is placed on; otherwise it has to move or shut down.
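
As an aside, the attribute-driven placement above can be exercised straight
from the crm shell; a minimal sketch, assuming the node names from the
configuration (the values are examples only):

 $ crm node attribute ygg1 show sheep      # check the current score
 $ crm node attribute ygg1 set sheep 0     # node no longer eligible for sheepdog
 $ crm node attribute ygg1 set sheep 1     # make it eligible again

With sheep set to 0, the -inf rule should stop sheepdog on that node, and
the colocation constraints should then move the VMs off it as well.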
> 
> Here's an example libvirt domain definition:
> 
> <domain type='kvm'>
>   <name>vfw1</name>
>   <uuid>64dff97d-fd8f-4c14-9fc3-dbfc6a197ff9</uuid>
>   <description>Firewall 1</description>
>   <memory>524288</memory>
>   <currentMemory>524288</currentMemory>
>   <vcpu>1</vcpu>
>   <os>
>     <type arch='x86_64' machine='pc-0.13'>hvm</type>
>     <boot dev='hd'/>
>     <boot dev='cdrom'/>
>     <bootmenu enable='yes'/>
>   </os>
>   <features>
>     <acpi/>
>     <apic/>
>   </features>
>   <cpu match='exact'>
>     <model>Opteron_G3</model>
>     <vendor>AMD</vendor>
>     <feature policy='require' name='skinit'/>
>     <feature policy='require' name='vme'/>
>     <feature policy='require' name='mmxext'/>
>     <feature policy='require' name='fxsr_opt'/>
>     <feature policy='require' name='cr8legacy'/>
>     <feature policy='require' name='ht'/>
>     <feature policy='require' name='3dnowprefetch'/>
>     <feature policy='require' name='3dnowext'/>
>     <feature policy='require' name='wdt'/>
>     <feature policy='require' name='extapic'/>
>     <feature policy='require' name='pdpe1gb'/>
>     <feature policy='require' name='osvw'/>
>     <feature policy='require' name='cmp_legacy'/>
>     <feature policy='require' name='3dnow'/>
>   </cpu>
>   <clock offset='utc'/>
>   <on_poweroff>destroy</on_poweroff>
>   <on_reboot>restart</on_reboot>
>   <on_crash>restart</on_crash>
>   <devices>
>     <emulator>/usr/bin/qemu-kvm</emulator>
>     <disk type='network' device='disk'>
>       <driver name='qemu' type='raw'/>
>       <source protocol='sheepdog' name='vfw1'>
>         <host name='localhost' port='7000'/>
>       </source>
>       <target dev='vda' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
>     </disk>
>     <disk type='file' device='cdrom'>
>       <driver name='qemu' type='raw' cache='writeback'/>
>       <source file='/vm/iso/OpenSUSE_JeOS64.x86_64-1.0.0.iso'/>
>       <target dev='hda' bus='ide'/>
>       <readonly/>
>       <address type='drive' controller='0' bus='0' unit='0'/>
>     </disk>
>     <controller type='ide' index='0'>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
>     </controller>
>     <interface type='bridge'>
>       <mac address='de:ad:09:e9:01:01'/>
>       <source bridge='br0'/>
>       <model type='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
>     </interface>
>     <interface type='bridge'>
>       <mac address='de:ad:09:e0:01:01'/>
>       <source bridge='br1'/>
>       <model type='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
>     </interface>
>     <input type='tablet' bus='usb'/>
>     <input type='mouse' bus='ps2'/>
>     <graphics type='vnc' port='5901' autoport='no' listen='0.0.0.0'/>
>     <video>
>       <model type='cirrus' vram='9216' heads='1'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
>     </video>
>     <memballoon model='virtio'>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
>     </memballoon>
>   </devices>
> </domain>
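
For reference, the sheepdog-backed disk above assumes the VDI already
exists; a minimal sketch of creating it and registering the guest (the
image size here is an arbitrary example):

 $ qemu-img create sheepdog:vfw1 20G        # create the VDI named in <source>
 $ collie vdi list                          # confirm the VDI is visible
 $ virsh define /etc/libvirt/qemu/vfw1.xml  # register the domain with libvirt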
> 
> Beyond that, all I needed to do to trigger the problem was toggle the
> standby value between 0 and 1 about 3-4 times at most, watching the VM
> migrate to and from the node, until sheepdog corrupted itself.

Thanks for your input!  I'll try to reproduce and fix the problem.
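
If I understand your test correctly, a toggle sequence along these lines
should reproduce the migration path you describe (the node name and the
settle time below are my assumptions):

 $ crm node standby ygg1   # VM should migrate away to ygg2
 $ sleep 60                # give the migration time to settle (assumed)
 $ crm node online ygg1    # failback: VM migrates back to ygg1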

> 
> The second scenario: 6 nodes, 3 copies; 4 servers running virtual
> machines and 2 dedicated to storage only. Two of my VM servers, vserver2
> and vserver4, went down due to a failing APC. That happened only once
> before the problem was fully revealed: total corruption occurred, with
> all VDIs wiped out due to missing objects.
> 

When you encounter the same problem next time, could you try the
following command on each server that is still alive:

 $ collie cluster info -a hostname

where hostname is the server's name.  I think it would help me find
out what is causing this problem.
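
For example, a quick loop like the following (with the real hostnames
substituted) would collect the output from every node that is still up:

 $ for h in ygg1 ygg2; do echo "== $h =="; collie cluster info -a $h; done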


Thanks,

Kazutaka


