[Sheepdog] sheepdog Digest, Vol 19, Issue 14

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Tue May 3 23:18:32 CEST 2011


At Sat, 30 Apr 2011 10:57:43 -0700 (PDT),
Ski Mountain wrote:
> 
> I am also having similar issues making one sheep node connect to another one.  
> 
> I am still testing with nodes inside VMware.  I have tried NAT mode vs. bridged
> mode to see if it makes any difference, and it seems to make none.  I have
> set up three different nodes.  Sometimes the nodes are able to see each other,
> but often upon boot none of the nodes can see each other.  Is there a
> specific way to make a node connect to the rest of the cluster?  And what is
> the best way to shut down a node for maintenance?

Could you give me the output of 'collie cluster info' on each machine
when your Sheepdog cluster doesn't work?

To shut down Sheepdog safely, please run 'collie cluster shutdown' on
one of your machines.
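
If it helps, here is a minimal sketch for gathering the 'collie cluster
info' output from every machine in one go.  It assumes you can ssh to
each node; the function name collect_cluster_info and the RUN variable
are made up for illustration and are not part of Sheepdog.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sheepdog): print the output of
# 'collie cluster info' for every host given as an argument.
# RUN is an assumed override for the remote runner; it defaults to ssh.
collect_cluster_info() {
    for host in "$@"; do
        echo "== $host =="
        # Capture stderr too, since a failing node reports its error there.
        ${RUN:-ssh} "$host" collie cluster info 2>&1 \
            || echo "(command failed on $host)"
    done
}

# Example call, using the addresses from the quoted report:
#   collect_cluster_info 172.29.22.74 172.29.22.78
```

Running it while the cluster is in the broken state, and sending the
whole output, should show how each node sees the cluster.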

Thanks,

Kazutaka

> 
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Fri, 29 Apr 2011 17:15:43 +0200
> From: "S. Bonnegent" <sebastien.bonnegent at gmail.com>
> To: sheepdog at lists.wpkg.org
> Subject: [Sheepdog] sheepdog on ubuntu 11.04
> Message-ID: <4DBAD61F.90204 at gmail.com>
> Content-Type: text/plain; charset="utf-8"
> 
> Hi,
> 
> I am trying to test Sheepdog on Ubuntu 11.04 (64-bit) with 2 hosts, but I am
> having trouble debugging the situation.
> 
> I use 2 nodes (PC1 and PC2) with the default sheepdog package
> (0.2.2-0ubuntu1). I added "user_xattr" (my partition is ext4) in
> /etc/fstab and configured /etc/corosync/corosync.conf (the file is below).
> 
> PC1 is 172.29.22.74
> PC2 is 172.29.22.78
> 
> On PC1, I start sheepdog with "service sheepdog start" and I have:
> 
> # collie node list
>   Idx   Node id (FNV-1a) - Host:Port
> ------------------------------------------------
> * 0     52b76e70de45e6c8 - 172.29.22.74:7000
> 
> # corosync-cpgtool
> Group Name             PID         Node ID
> sheepdog
>                       1070      1242963372 (172.29.22.74)
> 
> 
> On PC2, I start sheepdog too and I obtain:
> # collie node list
> The node had failed to join sheepdog
> failed to get node list
> 
> # corosync-cpgtool
> Group Name             PID         Node ID
> sheepdog
>                       1070      1242963372 (172.29.22.74)
>                       1064      1310072236 (172.29.22.78)
> 
> And now, on PC1 I have:
> # collie node list
> The node had failed to join sheepdog
> failed to get node list
> 
> # corosync-cpgtool
> Group Name             PID         Node ID
> sheepdog
>                       1070      1242963372 (172.29.22.74)
>                       1064      1310072236 (172.29.22.78)
> 
> In the PC1 logs, there is this message:
> 
> get_cluster_status(362) joining node has invalid ctime, 5960354062892656328
> 
> but PC1 and PC2 use NTP and have exactly the same date and time.
> Do you know why sheepdog can't start?
> 
> 
> Thank you.
> 
> 
> 
> Note: my /etc/corosync/corosync.conf
> compatibility: whitetank
> totem {
>   version: 2
>   secauth: off
>   threads: 0
>   interface {
>     ringnumber: 0
>     bindnetaddr: 172.29.0.0
>     mcastaddr: 226.94.1.1
>     mcastport: 5405
>   }
> }
> logging {
>   fileline: off
>   to_stderr: no
>   to_logfile: yes
>   to_syslog: yes
>   logfile: /var/log/corosync.log
>   debug: off
>   timestamp: on
>   logger_subsys {
>    subsys: AMF
>    debug: off
>   }
> }
> amf {
>   mode: disabled
> }
> 
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: last_logs_on_pc1.log
> Type: text/x-log
> Size: 2108 bytes
> Desc: not available
> URL: 
> <http://lists.wpkg.org/pipermail/sheepdog/attachments/20110429/4cb26be6/attachment-0001.bin>
> 
> 
> ------------------------------
> -- 
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog


