[sheepdog] [PATCH 1/2] test: add a test for sockfd keepalive
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Mon Sep 3 13:15:07 CEST 2012
At Mon, 27 Aug 2012 17:32:33 +0800,
Liu Yuan wrote:
>
> From: Liu Yuan <tailai.ly at taobao.com>
>
> Signed-off-by: Liu Yuan <tailai.ly at taobao.com>
> ---
> tests/035 | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> tests/035.out | 42 ++++++++++++++++++++++++++++++++++++++++++
> tests/group | 1 +
> 3 files changed, 97 insertions(+)
> create mode 100755 tests/035
> create mode 100644 tests/035.out
>
> diff --git a/tests/035 b/tests/035
> new file mode 100755
> index 0000000..501f959
> --- /dev/null
> +++ b/tests/035
> @@ -0,0 +1,54 @@
> +#!/bin/bash
> +
> +# Test sockfd keepalive
> +
> +seq=`basename $0`
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1 # failure is the default!
> +
> +trap "_uninit; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common.rc
> +. ./common.filter
> +
> +_uninit()
> +{
> + iptables -D INPUT -p tcp --sport 7001 -j DROP
> + iptables -D INPUT -p tcp --dport 7001 -j DROP
> +}
> +
> +_cleanup
> +
> +for i in `seq 0 1 2`; do
> + _start_sheep $i
> +done
> +
> +_wait_for_sheep 3
> +
> +$COLLIE cluster format -c 3 -m unsafe
> +
> +$COLLIE vdi create test 40M
> +(
> +dd if=/dev/urandom | $COLLIE vdi write test
> +) &
> +
> +sleep 3
> +# Simulate machine(127.0.0.1:7001) down
> +iptables -A INPUT -p tcp --sport 7001 -j DROP
> +iptables -A INPUT -p tcp --dport 7001 -j DROP
> +
> +sleep 1
> +# Trigger the confchg
> +_kill_sheep 1
> +
> +_wait_for_collie
> +
> +for i in `seq 0 9`; do
> + $COLLIE vdi object -i $i test
> +done
> +
> +status=0
> diff --git a/tests/035.out b/tests/035.out
> new file mode 100644
> index 0000000..0c55d7e
> --- /dev/null
> +++ b/tests/035.out
> @@ -0,0 +1,42 @@
> +QA output created by 035
> +using backend farm store
> +Looking for the object 0x7c2b2500000000 (the inode vid 0x7c2b25 idx 0) with 2 nodes
> +
> +127.0.0.1:7000 has the object (should be 2 copies)
> +127.0.0.1:7002 has the object (should be 2 copies)
> +Looking for the object 0x7c2b2500000001 (the inode vid 0x7c2b25 idx 1) with 2 nodes
> +
> +127.0.0.1:7000 has the object (should be 2 copies)
> +127.0.0.1:7002 has the object (should be 2 copies)
> +Looking for the object 0x7c2b2500000002 (the inode vid 0x7c2b25 idx 2) with 2 nodes
> +
> +127.0.0.1:7000 has the object (should be 2 copies)
> +127.0.0.1:7002 has the object (should be 2 copies)
> +Looking for the object 0x7c2b2500000003 (the inode vid 0x7c2b25 idx 3) with 2 nodes
> +
> +127.0.0.1:7000 has the object (should be 2 copies)
> +127.0.0.1:7002 has the object (should be 2 copies)
> +Looking for the object 0x7c2b2500000004 (the inode vid 0x7c2b25 idx 4) with 2 nodes
> +
> +127.0.0.1:7000 has the object (should be 2 copies)
> +127.0.0.1:7002 has the object (should be 2 copies)
> +Looking for the object 0x7c2b2500000005 (the inode vid 0x7c2b25 idx 5) with 2 nodes
> +
> +127.0.0.1:7000 has the object (should be 2 copies)
> +127.0.0.1:7002 has the object (should be 2 copies)
> +Looking for the object 0x7c2b2500000006 (the inode vid 0x7c2b25 idx 6) with 2 nodes
> +
> +127.0.0.1:7000 has the object (should be 2 copies)
> +127.0.0.1:7002 has the object (should be 2 copies)
> +Looking for the object 0x7c2b2500000007 (the inode vid 0x7c2b25 idx 7) with 2 nodes
> +
> +127.0.0.1:7000 has the object (should be 2 copies)
> +127.0.0.1:7002 has the object (should be 2 copies)
> +Looking for the object 0x7c2b2500000008 (the inode vid 0x7c2b25 idx 8) with 2 nodes
> +
> +127.0.0.1:7000 has the object (should be 2 copies)
> +127.0.0.1:7002 has the object (should be 2 copies)
> +Looking for the object 0x7c2b2500000009 (the inode vid 0x7c2b25 idx 9) with 2 nodes
> +
> +127.0.0.1:7000 has the object (should be 2 copies)
> +127.0.0.1:7002 has the object (should be 2 copies)
> diff --git a/tests/group b/tests/group
> index d20de40..1dafad4 100644
> --- a/tests/group
> +++ b/tests/group
> @@ -46,3 +46,4 @@
> 032 auto quick store
> 033 auto quick store
> 034 auto quick store
> +035 auto quick cluster
I found that this script occasionally takes a long time (about 15
minutes). Perhaps TCP keepalive is not working in some situations?
The problem is highly reproducible in my environment with the
following loop:
$ while test "$?" -eq 0; do ./check 35 -corosync; done
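To separate the slow runs from the normal ones, a timed variant of the loop above could look roughly like the sketch below. The helper name run_until_slow, its return codes, and the 60-second limit are assumptions made for illustration; they are not part of the test suite.

```shell
#!/bin/bash
# Hypothetical helper (not part of the sheepdog tests): rerun a command
# until it fails or a single run takes longer than LIMIT seconds.
run_until_slow()
{
    local limit=$1; shift
    local start elapsed

    while :; do
        start=$(date +%s)
        "$@" || return 1                    # stop on the first failing run
        elapsed=$(( $(date +%s) - start ))
        echo "run took ${elapsed}s"
        if [ "$elapsed" -gt "$limit" ]; then
            return 2                        # a slow run was caught
        fi
    done
}

# Example with an assumed 60s limit, well below the ~15 minutes seen:
#   run_until_slow 60 ./check 35 -corosync
```

A return value of 2 means the loop stopped on a slow run, so the sheep logs and socket state can be inspected while the situation is still fresh.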
I wonder if we should dig into this problem. Can we close all
connections when the epoch is incremented? I think you tried that
before but gave up.
http://www.mail-archive.com/sheepdog@lists.wpkg.org/msg04524.html
Is that approach still difficult to implement in the current
Sheepdog?
Thanks,
Kazutaka