[Sheepdog] A question about sheepdog's behavior ...

Wed Oct 27 09:17:49 CEST 2010

Thank you very much for your answers.
If you need any testing done on future versions, I'm available to do it.

Talk to you soon,
davide

On 27/10/2010 8.16, MORITA Kazutaka wrote:
> At Tue, 26 Oct 2010 12:39:06 +0200,
> Davide Casale wrote:
>> Hi to all,
>> I've installed Sheepdog Daemon, version 0.1.0 (with corosync 1.2.0 svn
>> rev. 2637) on Ubuntu 10.04 LTS.
>> The relevant part of the corosync.conf file is:
>> ---
>> compatibility: whitetank
>> totem {
>>           version: 2
>>           secauth: off
>>           threads: 0
>>           token: 3000
>>           consensus: 5000
>>           interface {
>>                   ringnumber: 0
>>                   bindnetaddr: 192.168.7.x
>>                   mcastaddr: 226.94.1.1
>>                   mcastport: 5405
>>           }
>> }
>> ---
>> I've installed everything on three machines with the default redundancy (that's 3,
>> is that correct? I launch sheepdog with the default /etc/init.d/sheepdog start).
> Yes, 3 is the default redundancy.
>
>> I've got 20 GB of KVM virtual machines.
>>
>> The questions are:
>>
>> - is it correct that if a single node crashes (or I stop the sheepdog
>> processes with "killall sheep"), when I relaunch sheepdog ALL the data
>> is rebuilt from scratch from the other two nodes (each time it restarts
>> from zero bytes and goes up to 20 GB)?
>> I thought that only the changed blocks (4 MB each) would be resynchronized.
> Yes, that is the correct behavior.  Sheepdog cannot detect which objects
> were updated since the previous node membership change, so the safe
> choice is to receive all objects from the already joined nodes.
> However, as you say, it's worth considering optimizing this.
>
>> - is it correct that while the synchronization is running on a node, all
>> the others are frozen (and the kvm virtual machines are frozen too)
>> until the synchronization is completed?
> Yes.  Currently, if a virtual machine accesses an object that is
> not placed on the right nodes (this can happen because of node
> membership changes), sheepdog blocks the access until the object is
> moved to the right node.  But this behavior should be fixed as soon as
> possible, I think.
>
>> And perhaps this is a little bug:
>> if, during the synchronization, I launch the command 'collie node info'
>> on the node being synchronized, the command hangs after
>> the first output. If I stop it with CTRL+C, then when the synchronization
>> ends one of the sheep processes crashes, and if I relaunch sheepdog the
>> synchronization starts again from the beginning (from zero bytes) ...
>>
> 'collie node info' sleeps for the same reason as above.  The sheep
> crash should be fixed by the following patch.  Thanks for
> your feedback.
>
>
> =
> From: MORITA Kazutaka<morita.kazutaka at lab.ntt.co.jp>
> Subject: [PATCH] sheep: call free_request() after decrementing reference counters
>
> We cannot call free_request() here because client_decref() accesses
> req->ci.
>
> Signed-off-by: MORITA Kazutaka<morita.kazutaka at lab.ntt.co.jp>
> ---
>   sheep/sdnet.c |    7 ++++++-
>   1 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/sheep/sdnet.c b/sheep/sdnet.c
> index 9ad0bc7..6d7e7a3 100644
> --- a/sheep/sdnet.c
> +++ b/sheep/sdnet.c
> @@ -271,12 +271,17 @@ static void free_request(struct request *req)
>
>   static void req_done(struct request *req)
>   {
> +	int dead = 0;
> +
>   	list_add(&req->r_wlist,&req->ci->done_reqs);
>   	if (conn_tx_on(&req->ci->conn)) {
>   		dprintf("connection seems to be dead\n");
> -		free_request(req);
> +		dead = 1;
>   	}
>   	client_decref(req->ci);
> +
> +	if (dead)
> +		free_request(req);
>   }
>
>   static void init_rx_hdr(struct client_info *ci)
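
For anyone reading the patch above, here is a minimal, self-contained sketch
of the ordering issue it describes. It is not the actual sheep code: the two
structs are reduced to the fields needed for the illustration, the done_reqs
list is replaced by a plain counter, the conn_dead parameter stands in for the
connection check, and the freed_* counters and demo_req_done() are
hypothetical helpers added only so the behavior can be observed. The point is
simply that req->ci has to be read before req is freed, so the free is
deferred until after client_decref().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for sheep's structures. */
struct client_info {
	int refcnt;	/* references held on this client */
	int done_reqs;	/* stands in for the done_reqs list */
};

struct request {
	struct client_info *ci;
};

/* Counters added for this sketch only, to observe what was released. */
static int freed_requests;
static int freed_clients;

static void free_request(struct request *req)
{
	free(req);
	freed_requests++;
}

static void client_decref(struct client_info *ci)
{
	if (--ci->refcnt == 0) {
		free(ci);
		freed_clients++;
	}
}

/* Same ordering as the patched req_done(): remember that the connection
 * is dead, drop the client reference while req is still valid, and only
 * then free the request. */
static void req_done(struct request *req, int conn_dead)
{
	int dead = 0;

	assert(req && req->ci);
	req->ci->done_reqs++;
	if (conn_dead)
		dead = 1;	/* freeing req here would leave req->ci dangling below */

	client_decref(req->ci);	/* reads req->ci, so req must not be freed yet */

	if (dead)
		free_request(req);
}

/* Returns 0 if one request and one client were released, -1 otherwise. */
int demo_req_done(void)
{
	int reqs_before = freed_requests;
	int clients_before = freed_clients;
	struct client_info *ci = calloc(1, sizeof(*ci));
	struct request *req = calloc(1, sizeof(*req));

	if (!ci || !req) {
		free(ci);
		free(req);
		return -1;
	}
	ci->refcnt = 1;	/* the single reference taken for this request */
	req->ci = ci;

	req_done(req, 1);	/* simulate a dead connection */

	if (freed_requests - reqs_before != 1)
		return -1;
	if (freed_clients - clients_before != 1)
		return -1;
	return 0;
}
```

With the pre-patch ordering (free_request() called inside the if, before
client_decref()), the client_decref(req->ci) line would read from memory that
has already been freed, which is consistent with the crash reported above.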

-- 

----------------------------------
DAVIDE CASALE
Security Engineer
mailto:casale at shorr-kan.com

SHORR KAN IT ENGINEERING Srl
www.shorr-kan.com
Via Sestriere 28/a
10141 Torino
Phone:  +39 011 382 8358
Fax:    +39 011 384 2028
----------------------------------