[Sheepdog] Small sheepdog howto

Piavlo piavka at cs.bgu.ac.il
Mon Jan 25 16:33:06 CET 2010


Here is my corosync.conf, where X.X.X.0 stands for my subnet's network address:
------------------------------------------------------
totem {
        version: 2
        secauth: on
        threads: 2
        token: 1000
        consensus: 1300
        interface {
                ringnumber: 0
                bindnetaddr: X.X.X.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}

logging {
        fileline: off
        to_stderr: yes
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
        mode: disabled
}
------------------------------------------------------
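
If it helps with picking the bindnetaddr value: it is the network address of the subnet, not the node's own IP. A minimal Python sketch, assuming you know a node's address and prefix length. The host address 192.168.6.10 is made up for illustration (only the 192.168.6.0/24 subnet comes from the quoted message), and the function name is hypothetical:

```python
import ipaddress


def bindnetaddr_for(host_with_prefix):
    """Return the network address of the subnet that an interface
    address such as "192.168.6.10/24" belongs to."""
    iface = ipaddress.ip_interface(host_with_prefix)
    return str(iface.network.network_address)


if __name__ == "__main__":
    # Hypothetical node address inside the 192.168.6.0/24 subnet.
    print(bindnetaddr_for("192.168.6.10/24"))
```

For a /24 this just zeroes the last octet, but the same call also handles other prefix lengths.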

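About the 127.0.1.1:7000 addresses in the listing quoted below: Debian and Ubuntu installs commonly carry a "127.0.1.1 <hostname>" line in /etc/hosts. I am assuming (not checked against the collie source) that collie takes the address it advertises from resolving the local hostname, in which case every node would report 127.0.1.1. A minimal Python sketch that checks hosts-file content for such a mapping. The function name and the sample entries (other than the wido-desktop hostname) are made up for illustration:

```python
import ipaddress

# Illustrative hosts-file content; the 192.168.6.10 entry is hypothetical.
SAMPLE_HOSTS = """\
127.0.0.1 localhost
127.0.1.1 wido-desktop
192.168.6.10 node1
"""


def loopback_mappings(hosts_text, hostname):
    """Return the loopback addresses that hosts_text maps hostname to."""
    found = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        addr, names = fields[0], fields[1:]
        if hostname not in names:
            continue
        try:
            if ipaddress.ip_address(addr).is_loopback:
                found.append(addr)
        except ValueError:
            # Not a parseable IP address; skip the entry.
            continue
    return found


if __name__ == "__main__":
    print(loopback_mappings(SAMPLE_HOSTS, "wido-desktop"))
```

To check a real node, pass the contents of /etc/hosts and the output of socket.gethostname() instead of the sample values. A non-empty result means the hostname resolves to a loopback address on that node.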

PCextreme B.V. - Wido den Hollander wrote:
> Hi,
>
> I've just set up Sheepdog on four Ubuntu 9.10 AMD64 machines and wanted
> to start testing it.
>
> But I can't get the nodes to work properly. This is what I have done:
>
> * Compiled sheepdog in /opt/sheepdog
> * Compiled the patched qemu-kvm in /opt/qemu-kvm
> * Configured corosync with:
>
>         interface {
>                 ringnumber: 0
>                 bindnetaddr: 192.168.6.0
>                 mcastaddr: 226.94.1.1
>                 mcastport: 5405
>         }
>
> My subnet here is 192.168.6.0/24
>
> But when I run "shepherd info -t dog" I get:
>
> root@wido-desktop:~# shepherd info -t dog
>   Idx	Node id (FNV-1a)    - Host:Port
> --------------------------------------------------
>   0	4c212f70289e9103 - 127.0.1.1:7000
>   1	4c212f70289e9103 - 127.0.1.1:7000
>   2	4c212f70289e9103 - 127.0.1.1:7000
>   3	4c212f70289e9103 - 127.0.1.1:7000
> * 4	4c212f70289e9103 - 127.0.1.1:7000
>   5	4c212f70289e9103 - 127.0.1.1:7000
>   6	4c212f70289e9103 - 127.0.1.1:7000
>   7	4c212f70289e9103 - 127.0.1.1:7000
>   8	4c212f70289e9103 - 127.0.1.1:7000
> root@wido-desktop:~#
>
> All the nodes claim to have the same nodeid? I have used
> "start-sheepdog" to start the "collie" daemon.
>
> On my nodes it looks like:
>
> root@wido-desktop:~# ps aux|grep collie
> root      4492  0.0  0.0  10284   592 ?        Ss   15:11   0:00 /usr/src/sheepdog/collie/collie --port 7000 /srv/sheepdog/0 -d
> root      4493  0.0  0.2  65752 18132 ?        Ssl  15:11   0:00 /usr/src/sheepdog/collie/collie --port 7000 /srv/sheepdog/0 -d
> root      4730  0.0  0.0   7336   888 pts/0    S+   16:18   0:00 grep collie
> root@wido-desktop:~#
>
> So they see each other, but all bind to 127.0.1.1:7000? I think I am
> missing something here.
>
> Would somebody be so kind as to share a corosync.conf for review? I think
> I am making a mistake there.
>
> When I run:
>
> shepherd mkfs --copies=3
>
> It doesn't return any output.
>
> collie segfaults with:
> [12279.751092] collie[4812]: segfault at 3 ip 0000000000000003 sp 00007f34d5945fb8 error 14 in collie[400000+e000]
>
> When I run collie in foreground mode it doesn't segfault, and gives the
> following output:
>
> collie: worker_routine(175) started this thread 0
> collie: worker_routine(175) started this thread 0
> collie: worker_routine(175) started this thread 0
> collie: worker_routine(175) started this thread 0
> collie: sd_confch(653) confchg nodeid 1cb8d44
> collie: sd_confch(655) 1 0 1
> collie: sd_confch(659) [0] node_id: 30117188, pid: 4853, reason: -1138069504
> collie: sd_deliver(532) op: 1, done: 0, size: 41056, from: 127.0.1.1:7000
> collie: __sd_deliver(468) op: 1, done: 0, size: 41056, from: 127.0.1.1:7000
> collie: sd_deliver(532) op: 1, done: 1, size: 41056, from: 127.0.1.1:7000
> collie: __sd_deliver(468) op: 1, done: 1, size: 41056, from: 127.0.1.1:7000
> collie: print_node_list(244) l nodeid: 1cb8d44, pid: 4853, ip: 127.0.1.1:7000
> collie: listen_handler(336) accepted a new connection, 8
> collie: cluster_queue_request(177) 0x1abf940 19
> collie: client_handler(297) closed a connection, 8
> collie: listen_handler(336) accepted a new connection, 8
> collie: cluster_queue_request(177) 0x1abf850 21
> collie: sd_deliver(532) op: 2, done: 0, size: 136, from: 127.0.1.1:7000
> collie: __sd_deliver(468) op: 2, done: 0, size: 136, from: 127.0.1.1:7000
> collie: listen_handler(336) accepted a new connection, 10
> collie: client_handler(297) closed a connection, 10
> collie: listen_handler(336) accepted a new connection, 10
> collie: client_handler(297) closed a connection, 10
> collie: listen_handler(336) accepted a new connection, 10
> collie: client_handler(297) closed a connection, 10
> collie: sd_deliver(532) op: 2, done: 1, size: 136, from: 127.0.1.1:7000
> collie: __sd_deliver(468) op: 2, done: 1, size: 136, from: 127.0.1.1:7000
> collie: vdi_op_done(438) 3
> collie: client_handler(297) closed a connection, 8
>
> This morning I had Sheepdog running on one node and it worked fine; I
> had a KVM VM running on top of it without any trouble.
>
> I checked out the Git repository this morning, so I am using the
> latest version.
>
> Any ideas?
>
>
>   