[Sheepdog] sheepdog Digest, Vol 19, Issue 14

Sat Apr 30 19:57:43 CEST 2011

I am also having similar issues making one sheep node connect to another one.  

I am still testing with nodes inside VMware.  I have tried NAT mode and bridged 
mode to see whether it makes any difference, and it seems to make none.  I have 
set up three different nodes.  Sometimes the nodes are able to see each other, 
but often upon boot none of them can see the others.  Is there a specific way 
to make a node connect to the rest of the cluster?  And what is the best way to 
shut down a node for maintenance?  

----------------------------------------------------------------------

Message: 1
Date: Fri, 29 Apr 2011 17:15:43 +0200
From: "S. Bonnegent" <sebastien.bonnegent at gmail.com>
To: sheepdog at lists.wpkg.org
Subject: [Sheepdog] sheepdog on ubuntu 11.04
Message-ID: <4DBAD61F.90204 at gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

I am trying to test Sheepdog on Ubuntu 11.04 (64-bit) with 2 hosts, but I am
having trouble debugging the situation.

I use 2 nodes (PC1 and PC2) with the default sheepdog package
(0.2.2-0ubuntu1). I added "user_xattr" to /etc/fstab (my partition is ext4)
and configured /etc/corosync/corosync.conf (the file is below).

PC1 is 172.29.22.74
PC2 is 172.29.22.78

On PC1, I start sheepdog with "service sheepdog start" and I get:

# collie node list
  Idx   Node id (FNV-1a) - Host:Port
------------------------------------------------
* 0     52b76e70de45e6c8 - 172.29.22.74:7000

# corosync-cpgtool
Group Name             PID         Node ID
sheepdog
                      1070      1242963372 (172.29.22.74)

On PC2, I start sheepdog too and I get:
# collie node list
The node had failed to join sheepdog
failed to get node list

# corosync-cpgtool
Group Name             PID         Node ID
sheepdog
                      1070      1242963372 (172.29.22.74)
                      1064      1310072236 (172.29.22.78)

And now, on PC1 I have:
# collie node list
The node had failed to join sheepdog
failed to get node list

# corosync-cpgtool
Group Name             PID         Node ID
sheepdog
                      1070      1242963372 (172.29.22.74)
                      1064      1310072236 (172.29.22.78)
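
A side note on reading the corosync-cpgtool output above: the two Node ID values are consistent with each host's IPv4 address stored as a 32-bit integer in little-endian byte order. The sketch below only checks that arithmetic for the two values shown; the helper name cpg_node_id_to_ipv4 is made up for illustration, and the byte order is an observation from this output, not a statement about every corosync build.

```python
import ipaddress


def cpg_node_id_to_ipv4(node_id: int) -> str:
    """Interpret a 32-bit node ID as an IPv4 address in little-endian byte order.

    Hypothetical helper: the byte order is inferred from the output in this
    post, not taken from corosync documentation.
    """
    raw = node_id.to_bytes(4, byteorder="little")
    return str(ipaddress.IPv4Address(raw))


if __name__ == "__main__":
    # The two node IDs reported by corosync-cpgtool in this thread.
    for node_id in (1242963372, 1310072236):
        print(node_id, "->", cpg_node_id_to_ipv4(node_id))
```

Both IDs decode to the addresses printed next to them, which suggests that both hosts joined the "sheepdog" CPG group with the expected addresses and that the failure happens above the corosync membership layer.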

In the PC1 logs, there is this message:

get_cluster_status(362) joining node has invalid ctime, 5960354062892656328

but PC1 and PC2 use NTP and have exactly the same date and time.
Do you know why sheepdog can't start?
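
One thing that can be checked from the numbers in this message alone: the decimal value in the "invalid ctime" log line, written as 64-bit hexadecimal, is the same as the Node id that "collie node list" printed for PC1 (52b76e70de45e6c8). The sketch below only performs that conversion; the helper name as_hex64 is made up for illustration, and what the match means inside sheepdog is not established here.

```python
def as_hex64(value: int) -> str:
    """Format an unsigned 64-bit integer as 16 lowercase hex digits."""
    return format(value, "016x")


# Value copied from the get_cluster_status() log line on PC1.
logged_ctime = 5960354062892656328

# Node id (FNV-1a) copied from the first "collie node list" output on PC1.
pc1_node_id = "52b76e70de45e6c8"

if __name__ == "__main__":
    print(as_hex64(logged_ctime))
    print("matches PC1 node id:", as_hex64(logged_ctime) == pc1_node_id)
```

If the match is not a coincidence, the logged number is probably not a wall-clock timestamp, which would explain why having NTP in sync on both hosts does not change the result.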

Thank you.

Note: my /etc/corosync/corosync.conf
compatibility: whitetank
totem {
  version: 2
  secauth: off
  threads: 0
  interface {
    ringnumber: 0
    bindnetaddr: 172.29.0.0
    mcastaddr: 226.94.1.1
    mcastport: 5405
  }
}
logging {
  fileline: off
  to_stderr: no
  to_logfile: yes
  to_syslog: yes
  logfile: /var/log/corosync.log
  debug: off
  timestamp: on
  logger_subsys {
   subsys: AMF
   debug: off
  }
}
amf {
  mode: disabled
}
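
For checking the bindnetaddr value in the config above: corosync expects the network address of the interface, that is, the host address with the host bits cleared by the netmask. The netmask of these hosts is not given in this message, so the prefix lengths below are assumptions, and the helper name bindnetaddr_for is made up for illustration.

```python
import ipaddress


def bindnetaddr_for(host_ip: str, prefix_len: int) -> str:
    """Return the network address of host_ip under the given prefix length.

    Hypothetical helper: prefix_len must come from the real interface
    configuration (for example from the output of "ip addr").
    """
    iface = ipaddress.ip_interface(f"{host_ip}/{prefix_len}")
    return str(iface.network.network_address)


if __name__ == "__main__":
    # Assumed prefix lengths; the real netmask is not stated in the post.
    for prefix_len in (16, 24):
        print(prefix_len, bindnetaddr_for("172.29.22.74", prefix_len))
```

With a /16 netmask the result is 172.29.0.0, which matches the config; with a /24 netmask it would be 172.29.22.0 instead, so the value is worth comparing against the actual netmask on both hosts.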

-------------- next part --------------
A non-text attachment was scrubbed...
Name: last_logs_on_pc1.log
Type: text/x-log
Size: 2108 bytes
Desc: not available
URL: <http://lists.wpkg.org/pipermail/sheepdog/attachments/20110429/4cb26be6/attachment-0001.bin>

------------------------------