[Sheepdog] Sheepdog 0.3.0 schedule and 0.4.0 plan
Chris Webb
chris at arachsys.com
Fri Nov 25 15:13:45 CET 2011
Hi. I've just tried the new HEAD of devel, 99d7c0f327, and now the machines
still in the network after a node has been killed never seem to eliminate it
and recover:
0028# ip link set eth1 down
0026# collie vdi list
name id size used shared creation time vdi id
------------------------------------------------------------------
[long hang]
failed to connect to 172.16.101.11:7001: No route to host
failed to connect 172.16.101.11:7001
failed to read a inode header
failed to connect to 172.16.101.11:7000: No route to host
failed to connect 172.16.101.11:7000
failed to read a inode header
[...wait a minute or two...]
0026# collie vdi list
name id size used shared creation time vdi id
------------------------------------------------------------------
failed to connect to 172.16.101.11:7001: No route to host
failed to connect 172.16.101.11:7001
failed to read a inode header
failed to connect to 172.16.101.11:7000: No route to host
failed to connect 172.16.101.11:7000
failed to read a inode header
[...and even after ten minutes...]
0026# collie node list
Idx - Host:Port Vnodes Zone
---------------------------------------------
0 - 172.16.101.7:7000 64 124063916
1 - 172.16.101.7:7001 64 124063916
2 - 172.16.101.7:7002 64 124063916
3 - 172.16.101.9:7000 64 157618348
4 - 172.16.101.9:7001 64 157618348
5 - 172.16.101.9:7002 64 157618348
6 - 172.16.101.11:7000 64 191172780
7 - 172.16.101.11:7001 64 191172780
8 - 172.16.101.11:7002 64 191172780
0026# collie vdi list
name id size used shared creation time vdi id
------------------------------------------------------------------
failed to connect to 172.16.101.11:7001: No route to host
failed to connect 172.16.101.11:7001
failed to read a inode header
failed to connect to 172.16.101.11:7000: No route to host
failed to connect 172.16.101.11:7000
failed to read a inode header
0026# collie vdi list
name id size used shared creation time vdi id
------------------------------------------------------------------
failed to connect to 172.16.101.11:7001: No route to host
failed to connect 172.16.101.11:7001
failed to read a inode header
failed to connect to 172.16.101.11:7000: No route to host
failed to connect 172.16.101.11:7000
failed to read a inode header
0026# collie vdi list
name id size used shared creation time vdi id
------------------------------------------------------------------
failed to connect to 172.16.101.11:7001: No route to host
failed to connect 172.16.101.11:7001
failed to read a inode header
failed to connect to 172.16.101.11:7000: No route to host
failed to connect 172.16.101.11:7000
failed to read a inode header
0026# collie node list
Idx - Host:Port Vnodes Zone
---------------------------------------------
0 - 172.16.101.7:7000 64 124063916
1 - 172.16.101.7:7001 64 124063916
2 - 172.16.101.7:7002 64 124063916
3 - 172.16.101.9:7000 64 157618348
4 - 172.16.101.9:7001 64 157618348
5 - 172.16.101.9:7002 64 157618348
6 - 172.16.101.11:7000 64 191172780
7 - 172.16.101.11:7001 64 191172780
8 - 172.16.101.11:7002 64 191172780
0026# collie node list
Idx - Host:Port Vnodes Zone
---------------------------------------------
0 - 172.16.101.7:7000 64 124063916
1 - 172.16.101.7:7001 64 124063916
2 - 172.16.101.7:7002 64 124063916
3 - 172.16.101.9:7000 64 157618348
4 - 172.16.101.9:7001 64 157618348
5 - 172.16.101.9:7002 64 157618348
6 - 172.16.101.11:7000 64 191172780
7 - 172.16.101.11:7001 64 191172780
8 - 172.16.101.11:7002 64 191172780
0026# collie node list
Idx - Host:Port Vnodes Zone
---------------------------------------------
0 - 172.16.101.7:7000 64 124063916
1 - 172.16.101.7:7001 64 124063916
2 - 172.16.101.7:7002 64 124063916
3 - 172.16.101.9:7000 64 157618348
4 - 172.16.101.9:7001 64 157618348
5 - 172.16.101.9:7002 64 157618348
6 - 172.16.101.11:7000 64 191172780
7 - 172.16.101.11:7001 64 191172780
8 - 172.16.101.11:7002 64 191172780
0026# echo /dev/sd[abcdefghijk]1
/dev/sda1 /dev/sdb1 /dev/sdc1
0026# echo /dev/sd[a-k]1
/dev/sda1 /dev/sdb1 /dev/sdc1
0026# collie node list
Idx - Host:Port Vnodes Zone
---------------------------------------------
0 - 172.16.101.7:7000 64 124063916
1 - 172.16.101.7:7001 64 124063916
2 - 172.16.101.7:7002 64 124063916
3 - 172.16.101.9:7000 64 157618348
4 - 172.16.101.9:7001 64 157618348
5 - 172.16.101.9:7002 64 157618348
6 - 172.16.101.11:7000 64 191172780
7 - 172.16.101.11:7001 64 191172780
8 - 172.16.101.11:7002 64 191172780
0026# collie vdi list
name id size used shared creation time vdi id
------------------------------------------------------------------
failed to connect to 172.16.101.11:7001: No route to host
failed to connect 172.16.101.11:7001
failed to read a inode header
failed to connect to 172.16.101.11:7000: No route to host
failed to connect 172.16.101.11:7000
failed to read a inode header
I even powered off the 0028 machine to ensure I was fully isolating it, but the
cluster never recovers.
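
In case it helps anyone reproducing this, here is a rough sketch of how the stale membership could be checked from a script rather than by eye. It only parses text in the format of the 'collie node list' output pasted above and reports entries whose host is in a set of addresses known to be down. The function names, the regular expression and the choice of 172.16.101.11 (the unreachable address in the errors above) are my own assumptions for illustration, not part of sheepdog.

```python
import re

# Matches rows such as "6 - 172.16.101.11:7000 64 191172780".
# The column layout is assumed from the output pasted in this report.
NODE_ROW = re.compile(
    r"^\s*(\d+)\s+-\s+(\d+\.\d+\.\d+\.\d+):(\d+)\s+(\d+)\s+(\d+)\s*$"
)


def parse_node_list(text):
    """Return (host, port) pairs from 'collie node list' style output."""
    nodes = []
    for line in text.splitlines():
        match = NODE_ROW.match(line)
        if match:  # header and separator lines do not match and are skipped
            nodes.append((match.group(2), int(match.group(3))))
    return nodes


def stale_members(node_list_text, dead_hosts):
    """Return listed nodes whose host is in the set of hosts known to be down."""
    return [node for node in parse_node_list(node_list_text)
            if node[0] in dead_hosts]


# Node list copied from the session above.
SAMPLE = """\
Idx - Host:Port Vnodes Zone
---------------------------------------------
0 - 172.16.101.7:7000 64 124063916
1 - 172.16.101.7:7001 64 124063916
2 - 172.16.101.7:7002 64 124063916
3 - 172.16.101.9:7000 64 157618348
4 - 172.16.101.9:7001 64 157618348
5 - 172.16.101.9:7002 64 157618348
6 - 172.16.101.11:7000 64 191172780
7 - 172.16.101.11:7001 64 191172780
8 - 172.16.101.11:7002 64 191172780
"""

if __name__ == "__main__":
    for host, port in stale_members(SAMPLE, {"172.16.101.11"}):
        print(f"still listed: {host}:{port}")
```

With the listing above, all three daemons on 172.16.101.11 are reported as still listed, which is the behaviour I am describing: the membership never shrinks.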
Best wishes,
Chris.