2013/2/12 icez network <icez at icez.net>: > How you start the cluster back? The recommended procedure is to start 2 > node, then 'collie cluster recover force' and wait until cluster recover > completed and then start the "new" 3rd node. I did the wrong procedure then: run sheep on the "new" node 3 run the recover I tried now to use the recommended prcedure. This time I used 4 nodes. Cluster shutdwon. Removed node 4. Run cluster recover with the 3 nodes on line. ('collie node recovery' shows nothing) I didn't add node 4 back. Status of the cluster is running. Now I run a guest on node 1; it starts but gives I/O errors. Here is the portion of sheep.log (node 1): ... Feb 13 10:37:24 [rw 464] recover_object_work(201) done:440 count:444, oid:a34c67000004d6 Feb 13 10:37:24 [rw 465] recover_object_work(201) done:441 count:444, oid:a34c67000005a0 Feb 13 10:37:24 [rw 466] recover_object_work(201) done:442 count:444, oid:a34c670000048d Feb 13 10:37:24 [rw 467] recover_object_work(201) done:443 count:444, oid:a34c67000004ec Feb 13 10:37:24 [main] queue_cluster_request(307) COMPLETE_RECOVERY (0x2300bb0) Feb 13 10:38:13 [io 471] do_lookup_vdi(373) looking for squeeze (a34c67) Feb 13 10:38:33 [gway 10025] wait_forward_request(206) fail 2 Feb 13 10:38:33 [gway 10026] wait_forward_request(206) fail 2 Feb 13 10:38:33 [gway 10027] wait_forward_request(206) fail 2 Feb 13 10:38:38 [gway 10436] wait_forward_request(206) fail 2 Feb 13 10:38:38 [gway 10438] wait_forward_request(206) fail 2 Feb 13 10:38:39 [gway 10457] wait_forward_request(206) fail 2 ... I run vdi check (I have only 1 vdi for testing), and after it, the guest doesn't report I/O errors. At this point, adding the 4th node is flowless. |