[Sheepdog] Collie cluster info 1970-01-01 01:00:00

Valerio Pachera sirio81 at gmail.com
Mon Aug 29 12:19:36 CEST 2011


2011/8/27 MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>:
> Hmm, it looks like there is no error.  What happens if you run laptop3
> first, and run laptop1 and laptop2 next?  Can the two nodes join
> Sheepdog correctly?

No.

I ran 'collie cluster shutdown' from laptop2 (the last one that was still on).
I stopped corosync on all 3 nodes.
I started corosync and sheep on node3 and it gave me the same message.

I then started corosync and sheep on node2 and node1.
On node2 I get no output from 'collie cluster info'.

From node1 I get:
# collie cluster info
Waiting for a format operation
Ctime                Epoch Nodes
1970-01-01 01:00:00      0 []
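
A note on the Ctime value above: 1970-01-01 01:00:00 is what Unix timestamp 0 looks like when printed in a zone that is one hour ahead of UTC. Together with epoch 0, the empty node list and the "Waiting for a format operation" line, this suggests the creation time was never set, rather than a real date. A minimal Python sketch, assuming the node's local zone was at +01:00 on that date (CET) and that the stored value is simply 0; the helper name format_ctime is made up for illustration:

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time on the node was UTC+01:00 (CET) on 1970-01-01.
CET = timezone(timedelta(hours=1), "CET")

def format_ctime(ctime_seconds, tz=CET):
    """Format a Unix timestamp in the same layout as the Ctime column."""
    return datetime.fromtimestamp(ctime_seconds, tz).strftime("%Y-%m-%d %H:%M:%S")

print(format_ctime(0))  # 1970-01-01 01:00:00
```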



Node1 has lots of messages in /mnt/sheepdog/sheepdog.log like these:
...
Aug 29 12:09:47 find_tgt_node(1107) 26, 64, 54, 128, 0
Aug 29 12:09:47 find_tgt_node(1146) 26, 0, 54
Aug 29 12:09:47 __recover_one(1277) rename
/mnt/sheepdog//obj/00000004/00a34c67000008a6.tmp to
/mnt/sheepdog//obj/00000004/00a34c67000008a6
Aug 29 12:09:47 __recover_one(1283) recovered oid a34c67000008a6 to epoch 4
Aug 29 12:09:47 recover_one(1340) 546 1140,   a34c6700000067
Aug 29 12:09:47 ob_open(491) failed to open
/mnt/sheepdog//obj/00000004/00a34c6700000067, No such file or
directory
Aug 29 12:09:47 recover_one(1397) 54, 2, 0
Aug 29 12:09:47 __recover_one(1176) recover obj a34c6700000067 from epoch 3
Aug 29 12:09:47 find_tgt_node(1107) 26, 64, 54, 128, 0
Aug 29 12:09:47 find_tgt_node(1146) 26, 0, 54
Aug 29 12:09:47 __recover_one(1277) rename
/mnt/sheepdog//obj/00000004/00a34c6700000067.tmp to
/mnt/sheepdog//obj/00000004/00a34c6700000067
Aug 29 12:09:47 __recover_one(1283) recovered oid a34c6700000067 to epoch 4
Aug 29 12:09:47 recover_one(1340) 547 1140,   a34c67000008a3
Aug 29 12:09:47 ob_open(491) failed to open
/mnt/sheepdog//obj/00000004/00a34c67000008a3, No such file or
directory
Aug 29 12:09:47 recover_one(1397) 54, 2, 0
Aug 29 12:09:47 __recover_one(1176) recover obj a34c67000008a3 from epoch 3
Aug 29 12:09:47 find_tgt_node(1107) 26, 64, 54, 128, 0
Aug 29 12:09:47 find_tgt_node(1146) 26, 0, 54
Aug 29 12:09:47 __recover_one(1277) rename
/mnt/sheepdog//obj/00000004/00a34c67000008a3.tmp to
/mnt/sheepdog//obj/00000004/00a34c67000008a3
Aug 29 12:09:47 __recover_one(1283) recovered oid a34c67000008a3 to epoch 4
Aug 29 12:09:47 recover_one(1340) 548 1140,   a34c670000086e
Aug 29 12:09:47 ob_open(491) failed to open
/mnt/sheepdog//obj/00000004/00a34c670000086e, No such file or
directory
Aug 29 12:09:47 recover_one(1397) 54, 2, 0
Aug 29 12:09:47 __recover_one(1176) recover obj a34c670000086e from epoch 3
Aug 29 12:09:47 find_tgt_node(1107) 26, 64, 54, 128, 0
Aug 29 12:09:47 find_tgt_node(1146) 26, 0, 54
Aug 29 12:09:48 __recover_one(1277) rename
/mnt/sheepdog//obj/00000004/00a34c670000086e.tmp to
/mnt/sheepdog//obj/00000004/00a34c670000086e
Aug 29 12:09:48 __recover_one(1283) recovered oid a34c670000086e to epoch 4
Aug 29 12:09:48 recover_one(1340) 549 1140,   a34c67000007e6
Aug 29 12:09:48 ob_open(491) failed to open
/mnt/sheepdog//obj/00000004/00a34c67000007e6, No such file or
directory
Aug 29 12:09:48 recover_one(1397) 56, 2, 1
Aug 29 12:09:48 __recover_one(1176) recover obj a34c67000007e6 from epoch 3
Aug 29 12:09:48 find_tgt_node(1107) 27, 64, 56, 128, 1
Aug 29 12:09:48 find_tgt_node(1146) 4294967295, 1, 57
Aug 29 12:09:48 __recover_one(1181) cannot find target node, a34c67000007e6
Aug 29 12:09:48 __recover_one(1176) recover obj a34c67000007e6 from epoch 3
Aug 29 12:09:48 find_tgt_node(1107) 27, 64, 56, 128, 0
Aug 29 12:09:48 find_tgt_node(1114) 27, 0, 56, 128
...

Node2 has lots of messages in /mnt/sheepdog/sheepdog.log like these:
...
Aug 24 12:07:12 __recover_one(1176) recover obj a34c6700000040 from epoch 1
Aug 24 12:07:12 find_tgt_node(1107) 29, 64, 59, 128, 0
Aug 24 12:07:12 find_tgt_node(1146) 29, 0, 59
Aug 24 12:07:12 __recover_one(1277) rename
/mnt/sheepdog//obj/00000002/00a34c6700000040.tmp to
/mnt/sheepdog//obj/00000002/00a34c6700000040
Aug 24 12:07:12 __recover_one(1283) recovered oid a34c6700000040 to epoch 2
Aug 24 12:07:12 recover_one(1340) 584 1140,   a34c67000008f8
Aug 24 12:07:12 ob_open(491) failed to open
/mnt/sheepdog//obj/00000002/00a34c67000008f8, No such file or
directory
Aug 24 12:07:12 recover_one(1397) 59, 2, 0
Aug 24 12:07:12 __recover_one(1176) recover obj a34c67000008f8 from epoch 1
Aug 24 12:07:12 find_tgt_node(1107) 29, 64, 59, 128, 0
Aug 24 12:07:12 find_tgt_node(1146) 29, 0, 59
Aug 24 12:07:12 __recover_one(1277) rename
/mnt/sheepdog//obj/00000002/00a34c67000008f8.tmp to
/mnt/sheepdog//obj/00000002/00a34c67000008f8
Aug 24 12:07:12 __recover_one(1283) recovered oid a34c67000008f8 to epoch 2
Aug 24 12:07:12 recover_one(1340) 585 1140,   a34c6700000949
Aug 24 12:07:12 ob_open(491) failed to open
/mnt/sheepdog//obj/00000002/00a34c6700000949, No such file or
directory
Aug 24 12:07:12 recover_one(1397) 60, 2, 1
Aug 24 12:07:12 __recover_one(1176) recover obj a34c6700000949 from epoch 1
Aug 24 12:07:12 find_tgt_node(1107) 29, 64, 60, 128, 1
Aug 24 12:07:12 find_tgt_node(1146) 4294967295, 1, 61
Aug 24 12:07:12 __recover_one(1181) cannot find target node, a34c6700000949
Aug 24 12:07:12 __recover_one(1176) recover obj a34c6700000949 from epoch 1
Aug 24 12:07:12 find_tgt_node(1107) 29, 64, 60, 128, 0
Aug 24 12:07:12 find_tgt_node(1114) 29, 0, 60, 128
Aug 24 12:07:12 __recover_one(1277) rename
/mnt/sheepdog//obj/00000002/00a34c6700000949.tmp to
/mnt/sheepdog//obj/00000002/00a34c6700000949
Aug 24 12:07:12 __recover_one(1283) recovered oid a34c6700000949 to epoch 2
Aug 24 12:07:12 recover_one(1340) 586 1140,   a34c670000062b
Aug 24 12:07:12 ob_open(491) failed to open
/mnt/sheepdog//obj/00000002/00a34c670000062b, No such file or
directory
Aug 24 12:07:12 recover_one(1397) 60, 2, 1
....
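
Both recovery logs repeat the same pattern: an object is recovered, or find_tgt_node prints 4294967295 and __recover_one reports "cannot find target node" before retrying. The value 4294967295 is 2^32 - 1, which is how -1 prints as an unsigned 32-bit integer; reading it as a "not found" marker is an assumption, not something the log states. The following Python sketch tallies which objects hit that message and which of them were recovered afterwards. The sample lines are copied from the node2 log with the mail line-wrapping undone; the regular expressions and the summarize helper are illustrative only.

```python
import re

# Lines copied from the node2 log above, one message per line.
SAMPLE = """\
Aug 24 12:07:12 __recover_one(1176) recover obj a34c6700000949 from epoch 1
Aug 24 12:07:12 find_tgt_node(1146) 4294967295, 1, 61
Aug 24 12:07:12 __recover_one(1181) cannot find target node, a34c6700000949
Aug 24 12:07:12 __recover_one(1176) recover obj a34c6700000949 from epoch 1
Aug 24 12:07:12 find_tgt_node(1114) 29, 0, 60, 128
Aug 24 12:07:12 __recover_one(1283) recovered oid a34c6700000949 to epoch 2
"""

# "<Mon> <day> <time> <function>(<line>) <message>"
LINE_RE = re.compile(r"^\w{3} +\d+ [\d:]+ (\w+)\(\d+\) (.*)$")
RECOVERED_RE = re.compile(r"recovered oid ([0-9a-f]+) to epoch (\d+)")
FAILED_RE = re.compile(r"cannot find target node, ([0-9a-f]+)")

def summarize(text):
    """Return (recovered, failed_once, still_missing) for a log excerpt."""
    recovered = {}       # oid -> epoch it was recovered to
    failed_once = set()  # oids that hit "cannot find target node"
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m or m.group(1) != "__recover_one":
            continue
        msg = m.group(2)
        ok = RECOVERED_RE.fullmatch(msg)
        bad = FAILED_RE.fullmatch(msg)
        if ok:
            recovered[ok.group(1)] = int(ok.group(2))
        elif bad:
            failed_once.add(bad.group(1))
    # Objects that failed and never show a later "recovered oid" line.
    still_missing = failed_once.difference(recovered)
    return recovered, failed_once, still_missing

recovered, failed_once, still_missing = summarize(SAMPLE)
print("recovered:", recovered)
print("failed at least once:", sorted(failed_once))
print("never recovered in this excerpt:", sorted(still_missing))
```

In this excerpt the object that failed once is recovered on the retry. The node1 excerpt is cut off right after the retry for a34c67000007e6, so the same check cannot be completed from the lines shown there.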


Node3 has some messages like these:
...
Aug 29 12:06:51 cpg_event_done(1316) 0x37d8890
Aug 29 12:06:51 __sd_deliver_done(980) op: 1, state: 1, size: 32840,
from: ::c0a8:21b:581b:4000:0:0:2
Aug 29 12:06:51 send_join_response(939) 3409 453159104
Aug 29 12:06:51 join(484) joining node send a wrong version message
Aug 29 12:06:51 cpg_event_done(1370) free 0x37d8890
Aug 29 12:06:51 sd_deliver(1012) op: 1, state: 3, size: 32840, from:
::c0a8:21b:581b:4000:0:0:1, nodeid: 990030016, pid: 7976
Aug 29 12:06:51 sd_deliver(1021) allow new deliver, 0x37d8890
Aug 29 12:06:51 start_cpg_event_work(1448) 0 1
Aug 29 12:06:51 cpg_event_fn(1280) 0x37d8890, 1 2
Aug 29 12:06:51 cpg_event_fn(1294) 3
Aug 29 12:06:51 __sd_deliver(861) op: 1, state: 3, size: 32840, from:
::c0a8:21b:581b:4000:0:0:1, pid: 3409
Aug 29 12:06:51 cpg_event_done(1316) 0x37d8890
Aug 29 12:06:51 __sd_deliver_done(980) op: 1, state: 3, size: 32840,
from: ::c0a8:21b:581b:4000:0:0:1
Aug 29 12:06:51 cpg_event_done(1370) free 0x37d8890
Aug 29 12:06:51 sd_deliver(1012) op: 1, state: 3, size: 32840, from:
::c0a8:21b:581b:4000:0:0:2, nodeid: 520267968, pid: 21674
Aug 29 12:06:51 sd_deliver(1021) allow new deliver, 0x37d8890
Aug 29 12:06:51 start_cpg_event_work(1448) 0 1
Aug 29 12:06:51 cpg_event_fn(1280) 0x37d8890, 1 2
Aug 29 12:06:51 cpg_event_fn(1294) 3
Aug 29 12:06:51 __sd_deliver(861) op: 1, state: 3, size: 32840, from:
::c0a8:21b:581b:4000:0:0:2, pid: 3409
Aug 29 12:06:51 cpg_event_done(1316) 0x37d8890
Aug 29 12:06:51 __sd_deliver_done(980) op: 1, state: 3, size: 32840,
from: ::c0a8:21b:581b:4000:0:0:2
Aug 29 12:06:51 cpg_event_done(1370) free 0x37d8890
Aug 29 12:10:17 listen_handler(523) accepted a new connection, 11
Aug 29 12:10:17 start_cpg_event_work(1448) 0 2
Aug 29 12:10:17 cluster_queue_request(255) 0x37d8990 87
Aug 29 12:10:17 client_handler(484) closed a connection, 11
Aug 29 12:13:08 listen_handler(523) accepted a new connection, 11
Aug 29 12:13:08 start_cpg_event_work(1448) 0 2
Aug 29 12:13:08 cluster_queue_request(255) 0x37d8960 87
Aug 29 12:13:08 client_handler(484) closed a connection, 11
...
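
To match the nodeid values in the node3 log to machines, it may help to read them as IPv4 addresses. This Python sketch assumes the nodeid is the node's IPv4 address in network byte order, read back as an integer on a little-endian host; that is an assumption about this setup, not something the log confirms. The third number, 453159104, comes from the send_join_response line and is only guessed to be a nodeid as well. The helper name nodeid_to_ipv4 is made up for illustration.

```python
import socket
import struct

def nodeid_to_ipv4(nodeid):
    """Read a 32-bit nodeid as an IPv4 address (assumed byte order:
    network-order address stored on a little-endian host)."""
    return socket.inet_ntoa(struct.pack("<I", nodeid))

for nodeid in (990030016, 520267968, 453159104):
    print(nodeid, "->", nodeid_to_ipv4(nodeid))
# 990030016 -> 192.168.2.59
# 520267968 -> 192.168.2.31
# 453159104 -> 192.168.2.27
```

Under that reading, the last address agrees with the c0a8:21b prefix in the "from:" fields, since c0.a8.02.1b in hex is 192.168.2.27.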


