[Sheepdog] Small sheepdog howto
Piavlo
piavka at cs.bgu.ac.il
Mon Jan 25 16:33:06 CET 2010
Here is my corosync.conf, where X.X.X.0 is the network address of my subnet:
------------------------------------------------------
totem {
    version: 2
    secauth: on
    threads: 2
    token: 1000
    consensus: 1300
    interface {
        ringnumber: 0
        bindnetaddr: X.X.X.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}
logging {
    fileline: off
    to_stderr: yes
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
amf {
    mode: disabled
}
------------------------------------------------------
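
To illustrate the X.X.X.0 placeholder above: bindnetaddr is the network address of the interface corosync should bind to, so a host on 192.168.6.0/24 would use 192.168.6.0. Below is a minimal Python sketch of that calculation. The host address 192.168.6.10 and the helper name bindnetaddr_for are made up for illustration and are not part of corosync.

```python
import ipaddress


def bindnetaddr_for(host_ip: str, prefix_len: int) -> str:
    """Return the network address for a host IP and prefix length.

    Hypothetical helper: it only does the address arithmetic and does
    not read or write corosync.conf.
    """
    iface = ipaddress.ip_interface(f"{host_ip}/{prefix_len}")
    return str(iface.network.network_address)


if __name__ == "__main__":
    # Assumed example host on the 192.168.6.0/24 subnet from the quoted mail.
    print(bindnetaddr_for("192.168.6.10", 24))  # prints 192.168.6.0
```

The printed value is what would go on the bindnetaddr line of the interface block.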
PCextreme B.V. - Wido den Hollander wrote:
> Hi,
>
> I've just set up Sheepdog on four Ubuntu 9.10 AMD64 machines and wanted
> to start testing it.
>
> But I can't get the nodes to work properly. This is what I have done:
>
> * Compiled sheepdog in /opt/sheepdog
> * Compiled the patched qemu-kvm in /opt/qemu-kvm
> * Configured corosync with:
>
> interface {
>     ringnumber: 0
>     bindnetaddr: 192.168.6.0
>     mcastaddr: 226.94.1.1
>     mcastport: 5405
> }
>
> My subnet here is 192.168.6.0/24
>
> But when I run "shepherd info -t dog" I get:
>
> root@wido-desktop:~# shepherd info -t dog
>   Idx  Node id (FNV-1a) - Host:Port
> --------------------------------------------------
>     0  4c212f70289e9103 - 127.0.1.1:7000
>     1  4c212f70289e9103 - 127.0.1.1:7000
>     2  4c212f70289e9103 - 127.0.1.1:7000
>     3  4c212f70289e9103 - 127.0.1.1:7000
> *   4  4c212f70289e9103 - 127.0.1.1:7000
>     5  4c212f70289e9103 - 127.0.1.1:7000
>     6  4c212f70289e9103 - 127.0.1.1:7000
>     7  4c212f70289e9103 - 127.0.1.1:7000
>     8  4c212f70289e9103 - 127.0.1.1:7000
> root@wido-desktop:~#
>
> All the nodes claim to have the same nodeid? I have used
> "start-sheepdog" to start the "collie" daemon.
>
> On my nodes it looks like:
>
> root@wido-desktop:~# ps aux|grep collie
> root 4492 0.0 0.0 10284 592 ? Ss 15:11 0:00 /usr/src/sheepdog/collie/collie --port 7000 /srv/sheepdog/0 -d
> root 4493 0.0 0.2 65752 18132 ? Ssl 15:11 0:00 /usr/src/sheepdog/collie/collie --port 7000 /srv/sheepdog/0 -d
> root 4730 0.0 0.0 7336 888 pts/0 S+ 16:18 0:00 grep collie
> root@wido-desktop:~#
>
> So they see each other, but all bind to 127.0.1.1:7000? I think I am
> missing something here.
>
> Would somebody be so kind as to share a corosync.conf for review? I think
> I am making a mistake there.
>
> When I run:
>
> shepherd mkfs --copies=3
>
> It doesn't return any output.
>
> collie segfaults with:
> [12279.751092] collie[4812]: segfault at 3 ip 0000000000000003 sp 00007f34d5945fb8 error 14 in collie[400000+e000]
>
> When I run collie in foreground mode, it doesn't segfault and gives the
> following output:
>
> collie: worker_routine(175) started this thread 0
> collie: worker_routine(175) started this thread 0
> collie: worker_routine(175) started this thread 0
> collie: worker_routine(175) started this thread 0
> collie: sd_confch(653) confchg nodeid 1cb8d44
> collie: sd_confch(655) 1 0 1
> collie: sd_confch(659) [0] node_id: 30117188, pid: 4853, reason: -1138069504
> collie: sd_deliver(532) op: 1, done: 0, size: 41056, from: 127.0.1.1:7000
> collie: __sd_deliver(468) op: 1, done: 0, size: 41056, from: 127.0.1.1:7000
> collie: sd_deliver(532) op: 1, done: 1, size: 41056, from: 127.0.1.1:7000
> collie: __sd_deliver(468) op: 1, done: 1, size: 41056, from: 127.0.1.1:7000
> collie: print_node_list(244) l nodeid: 1cb8d44, pid: 4853, ip: 127.0.1.1:7000
> collie: listen_handler(336) accepted a new connection, 8
> collie: cluster_queue_request(177) 0x1abf940 19
> collie: client_handler(297) closed a connection, 8
> collie: listen_handler(336) accepted a new connection, 8
> collie: cluster_queue_request(177) 0x1abf850 21
> collie: sd_deliver(532) op: 2, done: 0, size: 136, from: 127.0.1.1:7000
> collie: __sd_deliver(468) op: 2, done: 0, size: 136, from: 127.0.1.1:7000
> collie: listen_handler(336) accepted a new connection, 10
> collie: client_handler(297) closed a connection, 10
> collie: listen_handler(336) accepted a new connection, 10
> collie: client_handler(297) closed a connection, 10
> collie: listen_handler(336) accepted a new connection, 10
> collie: client_handler(297) closed a connection, 10
> collie: sd_deliver(532) op: 2, done: 1, size: 136, from: 127.0.1.1:7000
> collie: __sd_deliver(468) op: 2, done: 1, size: 136, from: 127.0.1.1:7000
> collie: vdi_op_done(438) 3
> collie: client_handler(297) closed a connection, 8
>
> This morning I had sheepdog running on one node and it worked fine; I
> had a KVM VM running on top of it without any trouble.
>
> I made a checkout of the Git repository this morning, so I am using the
> latest version.
>
> Any idea?
>
>
>
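
The 127.0.1.1:7000 entries in the quoted node list are all loopback addresses. One thing that may be worth checking is whether each machine's hostname resolves to a loopback address. This is common on Ubuntu, where /etc/hosts usually maps the hostname to 127.0.1.1. It is only an assumption that this is related: the thread does not confirm how collie picks its address. A small Python sketch of the check follows; the function names are made up for illustration.

```python
import ipaddress
import socket


def is_loopback(addr: str) -> bool:
    """True if addr lies in a loopback range such as 127.0.0.0/8."""
    return ipaddress.ip_address(addr).is_loopback


def resolved_hostname_address():
    """Return the IPv4 address the local hostname resolves to, or None."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        # Resolution failed (socket.gaierror is a subclass of OSError).
        return None


if __name__ == "__main__":
    addr = resolved_hostname_address()
    if addr is None:
        print("hostname does not resolve")
    elif is_loopback(addr):
        print(f"hostname resolves to loopback address {addr}")
    else:
        print(f"hostname resolves to {addr}")
```

If the first branch of the check reports a loopback address on every node, that would at least be consistent with every node advertising the same Host:Port.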