[Sheepdog] Sheepdog and corosync

Mon Aug 22 08:15:57 CEST 2011

At Tue, 16 Aug 2011 21:04:07 +0100,
Brian Candler wrote:
> 
> I am in the process of getting a trivial (1-node) sheepdog running under
> Ubuntu 11.04 x86_64.
> 
> I have the corosync package installed, copied corosync.conf.example to
> corosync.conf and set a valid bindnetaddr. It appears to start - these
> messages appear in /var/log/syslog
> 
> ~~~~
> Aug 16 20:31:37 x100 corosync[15772]:   [MAIN  ] Corosync Cluster Engine ('1.2.1'): started and ready to provide service.
> Aug 16 20:31:37 x100 corosync[15772]:   [MAIN  ] Corosync built-in features: nss
> Aug 16 20:31:37 x100 corosync[15772]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
> Aug 16 20:31:37 x100 corosync[15772]:   [TOTEM ] Initializing transport (UDP/IP).
> Aug 16 20:31:37 x100 corosync[15772]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Aug 16 20:31:37 x100 corosync[15772]:   [TOTEM ] The network interface [192.168.122.1] is now up.
> Aug 16 20:31:37 x100 corosync[15772]:   [SERV  ] Service engine loaded: corosync extended virtual synchrony service
> Aug 16 20:31:37 x100 corosync[15772]:   [SERV  ] Service engine loaded: corosync configuration service
> Aug 16 20:31:37 x100 corosync[15772]:   [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
> Aug 16 20:31:37 x100 corosync[15772]:   [SERV  ] Service engine loaded: corosync cluster config database access v1.01
> Aug 16 20:31:37 x100 corosync[15772]:   [SERV  ] Service engine loaded: corosync profile loading service
> Aug 16 20:31:37 x100 corosync[15772]:   [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
> Aug 16 20:31:37 x100 corosync[15772]:   [MAIN  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
> Aug 16 20:31:37 x100 corosync[15772]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Aug 16 20:31:37 x100 corosync[15772]:   [MAIN  ] Completed service synchronization, ready to provide service.
> ~~~~
> 
> However, when I try to run sheep, I get the following:
> 
> ~~~~
> $ sheep -f /var/tmp/sheep
> sheep: jrnl_recover(2305) Openning the directory /var/tmp/sheep/journal/00000000/.
> sheep: create_cluster(1709) Failed to initialize cpg, 100
> sheep: create_cluster(1710) Is corosync running?
> sheep: main(150) failed to create sheepdog cluster.
> ~~~~
> 
> And each time I do this, I get the following message in /var/log/syslog:
> 
>     Aug 16 20:32:46 x100 corosync[15772]:   [IPC   ] Invalid IPC credentials.
> 
> This suggests to me some sort of authentication issue between sheep and
> corosync.
> 
> The usage example at https://github.com/collie/sheepdog/wiki/Getting-Started
> seems to show sheep being run as a regular user, not root.  But I tried
> running as root anyway, and it seemed to work this time:
> 
> ~~~~
> $ sudo sheep -f /var/tmp/sheep
> sheep: jrnl_recover(2305) Openning the directory /var/tmp/sheep/journal/00000000/.
> sheep: set_addr(1696) addr = 192.168.122.1, port = 7000
> sheep: main(154) Sheepdog daemon (version 0.2.3) started
> sheep: read_epoch(2099) failed to read epoch 0
> ~~~~
> 
> OK, so let's go with that for now (although I'd prefer not to run as root)
> 
> ~~~~
> $ collie cluster format --copies=2
> $ collie node list
>    Idx - Host:Port          Vnodes   Zone
> -----------------------------------------
> *    0 - 192.168.122.1:7000  	64	0
> $ qemu-img create sheepdog:Test 2G
> Formatting 'sheepdog:Test', fmt=raw size=2147483648 
> qemu-img: Failed to write the requested VDI, Test
> 
> qemu-img: sheepdog:Test: error while creating raw: Input/output error
> ~~~~
> 
> Hmm, that's not so good. The sheep process says:
> 
> ~~~~
> sheep: cluster_queue_request(266) 0x7f6c891f4010 84
> sheep: attr(1928) use 'user_xattr' option?, user.sheepdog.copies
> sheep: __sd_deliver_done(925) unknown message 2
> sheep: cluster_queue_request(266) 0x10d3130 82
> sheep: cluster_queue_request(266) 0x10d3130 11
> sheep: do_lookup_vdi(236) looking for Test 4, ec9f05
> sheep: add_vdi(333) we create a new vdi, 0 Test (4) 2147483648, vid: ec9f05, base 0, cur 0 
> sheep: add_vdi(337) qemu doesn't specify the copies... 2
> sheep: store_queue_request_local(628) use 'user_xattr' option?
> sheep: write_object(647) fail 80ec9f0500000000 6
> sheep: __sd_deliver_done(925) unknown message 2
> ~~~~
> 
> Maybe I need to set copies=1 for a degraded cluster?
> 
> ~~~~
> $ collie cluster format --copies=1
> $ collie node list
>    Idx - Host:Port          Vnodes   Zone
> -----------------------------------------
> *    0 - 192.168.122.1:7000  	64	0
> brian@x100:/etc/corosync$ qemu-img create sheepdog:Test 2G
> Formatting 'sheepdog:Test', fmt=raw size=2147483648 
> qemu-img: Failed to write the requested VDI, Test
> 
> qemu-img: sheepdog:Test: error while creating raw: Input/output error
> ~~~~
> 
> Same result:
> 
> ~~~~
> sheep: cluster_queue_request(266) 0x10d3130 84
> sheep: attr(1928) use 'user_xattr' option?, user.sheepdog.copies
> sheep: __sd_deliver_done(925) unknown message 2
> sheep: cluster_queue_request(266) 0x10d3130 82
> sheep: cluster_queue_request(266) 0x10d3130 11
> sheep: do_lookup_vdi(236) looking for Test 4, ec9f05
> sheep: add_vdi(333) we create a new vdi, 0 Test (4) 2147483648, vid: ec9f05, base 0, cur 0 
> sheep: add_vdi(337) qemu doesn't specify the copies... 1
> sheep: store_queue_request_local(628) use 'user_xattr' option?
> sheep: write_object(647) fail 80ec9f0500000000 6
> sheep: __sd_deliver_done(925) unknown message 2
> ~~~~
> 
> I notice the message about the "user_xattr" option. However, this filesystem
> is ext4:
> 
>     $ mount | grep "on / "
>     /dev/sda5 on / type ext4 (rw,errors=remount-ro,commit=0)
> 
> and the Getting-Started guide says that user_xattr is only needed for ext3.
> However, let's try it anyway:
> 
>     $ sudo mount -o remount,user_xattr /
> 
> OK, that seems to work! Sheep shows:
> 
> ~~~~
> sheep: cluster_queue_request(266) 0x10d3130 11
> sheep: do_lookup_vdi(236) looking for Test 4, ec9f05
> sheep: add_vdi(333) we create a new vdi, 0 Test (4) 2147483648, vid: ec9f05, base 0, cur 0 
> sheep: add_vdi(337) qemu doesn't specify the copies... 1
> sheep: vdi_op_done(758) done 0 15507205
> sheep: __sd_deliver_done(925) unknown message 2
> ~~~~
> 
> and I can boot with
> 
>     $ qemu-system-x86_64 -cdrom /v/downloads/linux/ubuntu-10.04.3-server-amd64.iso sheepdog:Test 
> 
> So it looks like I have a one-node cluster:
> 
> ~~~~
> # collie cluster info
> Cluster status: running
> 
> Creation time        Epoch Nodes
> 1970-01-01 01:00:00      1 [192.168.122.1:7000]
> ~~~~
> 
> Anyway, my questions are:
> 
> 1. Can I run sheep as a non-root user? If so, how?
> 
> 2. Do I really need user_xattr even for ext4? (if so, the documentation
>    needs adjusting)

Sheepdog requires the underlying filesystem to support extended
attributes.  I'm not familiar with ext4, but the answer is probably
yes.
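
One way to check this directly on a given store directory is to try
setting a user.* attribute on a scratch file.  The following is only a
minimal Python sketch, not part of Sheepdog; the function name and the
attribute name "user.xattr_probe" are made up for the probe, and it
relies on the Linux-only os.setxattr/os.getxattr calls.

```python
import errno
import os
import tempfile


def supports_user_xattr(directory):
    """Return True if a user.* extended attribute can be set under directory."""
    # os.setxattr only exists on Linux builds of Python.
    if not hasattr(os, "setxattr"):
        return False
    # Hypothetical attribute name, used only for this probe.
    name = "user.xattr_probe"
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.setxattr(path, name, b"1")
        return os.getxattr(path, name) == b"1"
    except OSError as err:
        # ENOTSUP/EOPNOTSUPP: the filesystem or its mount options
        # do not allow user extended attributes.
        if err.errno in (errno.ENOTSUP, errno.EOPNOTSUPP):
            return False
        raise
    finally:
        # Always clean up the scratch file.
        os.close(fd)
        os.unlink(path)
```

Calling supports_user_xattr("/var/tmp/sheep") before and after the
"mount -o remount,user_xattr" step should show whether the mount
option made a difference on that filesystem.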

> 
> 3. Can I ignore the cluster creation time of '1970-01-01 01:00:00' ?

If your filesystem supports extended attributes, the correct creation
time can be set.

> 
> 4. What happens if you set --copies=N but the cluster degrades to
>    the point where it has fewer nodes than that? As far as I can see,
>    my one-node cluster with --copies=2 does actually work. Would the
>    data get copied when a new node is added?

Sheepdog still works when the cluster has fewer than N nodes.  When
you add a new machine, the number of copies of the data is increased,
up to N.
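
As a rough illustration of that rule, here is a small Python sketch.
It assumes at most one copy of an object is kept per node, which is a
simplification for the example rather than a description of the actual
placement code; the function name is hypothetical.

```python
def copies_stored(requested_copies, live_nodes):
    """Copies that can be placed, assuming at most one copy per node."""
    if requested_copies < 1:
        raise ValueError("requested_copies must be at least 1")
    if live_nodes < 0:
        raise ValueError("live_nodes must not be negative")
    # Never more than requested (N), never more than the nodes available.
    return min(requested_copies, live_nodes)


# With --copies=2: one node holds a single copy, and the count
# grows back to 2 once a second node joins.
for nodes in (1, 2, 3):
    print(nodes, copies_stored(2, nodes))
```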

> 
> One other point. Experimentation shows that "collie cluster format"
> instantly destroys all existing vdis, with no confirmation - and it can be
> run as a non-root user.  Can I suggest some idiot-proofing is done on this?
> e.g.  if a cluster already exists then you need to add some extra parameter
> to force deletion?

Good point.  I think we should add something like a "--force" option.

For example:

 $ collie cluster format          # succeeds only if the store directory is empty
 $ collie cluster format --force  # always succeeds
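
The intended check could look roughly like the Python sketch below.
This is only an illustration of the proposed behaviour, not collie
code: format_store, its arguments, and the error message are all
hypothetical.

```python
import os
import shutil


def format_store(store_dir, force=False):
    """Clear store_dir, refusing if it already holds data unless force is set."""
    entries = os.listdir(store_dir)
    if entries and not force:
        # Existing data would be destroyed, so require an explicit opt-in.
        raise RuntimeError(
            "store directory is not empty; use force=True to format anyway"
        )
    for entry in entries:
        path = os.path.join(store_dir, entry)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
```

With this shape, a plain format on an existing cluster fails loudly
instead of silently deleting every vdi, and the destructive path has
to be requested explicitly.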

Thanks,

Kazutaka