[Sheepdog] Cluster appears down but nodes report different epochs
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Tue Nov 8 06:33:39 CET 2011
At Mon, 7 Nov 2011 10:03:19 -0500,
Shawn Moore wrote:
>
> When I checked on the cluster this morning I see the following from
> cluster info. A sheep and corosync process was found on all nodes
> except blade162 which didn't have a sheep process but did have a
> corosync one. I'm not sure what has happened. We have not had a
In blade162.log:
Nov 05 00:06:30 sd_leave_handler(1222) Network Patition Bug: I should have exited.
This is probably a corosync bug, and Yunkai is trying to fix it:
http://lists.wpkg.org/pipermail/sheepdog/2011-November/001835.html
> network interruption that we are aware of as all nodes are on the same
> switch (along with countless other production systems). Logs from
> each node can be found
> http://www.stormpoint.com/files/sd_2011-11-07.zip. Total
> un-compressed size is ~ 254MB and this download size is around 21MB.
> When I left Friday, this is how our cluster looked:
>
> All nodes were running version 0.2.4_63_gd56e3b6
>
> Idx - Host:Port Vnodes Zone
> ---------------------------------------------
> 0 - 192.168.217.152:7000 64 1
> 1 - 192.168.217.153:7000 64 1
> 2 - 192.168.217.154:7000 64 1
> 3 - 192.168.217.155:7000 64 1
> 4 - 192.168.217.156:7000 64 1
> 5 - 192.168.217.157:7000 64 2
> 6 - 192.168.217.159:7000 64 2
> 7 - 192.168.217.160:7000 64 2
> 8 - 192.168.217.161:7000 64 2
> 9 - 192.168.217.162:7000 64 2
>
> [root@blade152 sheep]# collie cluster info
> Cluster status: running
>
> Cluster created at Wed Nov 2 11:02:26 2011
>
> Epoch Time Version
> 2011-11-04 17:26:22 14 [192.168.217.152:7000]
> 2011-11-04 17:26:22 13 [192.168.217.152:7000, 192.168.217.162:7000]
> 2011-11-04 17:26:22 12 [192.168.217.152:7000,
> 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 17:26:22 11 [192.168.217.152:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 17:26:22 10 [192.168.217.152:7000,
> 192.168.217.157:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.162:7000]
> 2011-11-04 17:26:21 9 [192.168.217.152:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 17:26:21 8 [192.168.217.152:7000,
> 192.168.217.155:7000, 192.168.217.156:7000, 192.168.217.157:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 17:26:21 7 [192.168.217.152:7000,
> 192.168.217.154:7000, 192.168.217.155:7000, 192.168.217.156:7000,
> 192.168.217.157:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.162:7000]
>
>
> [root@blade153 ~]# collie cluster info
> Cluster status: running
>
> Cluster created at Wed Nov 2 11:02:26 2011
>
> Epoch Time Version
> 2011-11-05 00:05:19 14 [192.168.217.153:7000]
> 2011-11-05 00:05:19 13 [192.168.217.153:7000, 192.168.217.162:7000]
> 2011-11-05 00:05:19 12 [192.168.217.153:7000,
> 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-05 00:05:19 11 [192.168.217.153:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-05 00:05:19 10 [192.168.217.153:7000,
> 192.168.217.157:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.162:7000]
> 2011-11-05 00:05:19 9 [192.168.217.153:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-05 00:05:18 8 [192.168.217.153:7000,
> 192.168.217.155:7000, 192.168.217.156:7000, 192.168.217.157:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-05 00:05:18 7 [192.168.217.153:7000,
> 192.168.217.154:7000, 192.168.217.155:7000, 192.168.217.156:7000,
> 192.168.217.157:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.162:7000]
>
>
> [root@blade154 ~]# collie cluster info
> Cluster status: running
>
> Cluster created at Wed Nov 2 11:02:26 2011
>
> Epoch Time Version
> 2011-11-04 13:25:06 6 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 06:58:12 5 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 05:57:43 4 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-02 10:49:34 3 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
> 2011-11-02 10:33:44 2 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-02 07:01:26 1 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
>
>
> [root@blade155 ~]# collie cluster info
> Cluster status: running
>
> Cluster created at Wed Nov 2 11:02:26 2011
>
> Epoch Time Version
> 2011-11-04 13:24:42 6 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 06:57:48 5 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 05:57:19 4 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-02 10:49:07 3 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
> 2011-11-02 10:33:17 2 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-02 07:00:59 1 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
>
>
> [root@blade156 ~]# collie cluster info
> Cluster status: running
>
> Cluster created at Wed Nov 2 11:02:26 2011
>
> Epoch Time Version
> 2011-11-05 07:39:11 9 [192.168.217.154:7000,
> 192.168.217.155:7000, 192.168.217.156:7000, 192.168.217.157:7000,
> 192.168.217.159:7000, 192.168.217.160:7000]
> 2011-11-05 07:39:11 8 [192.168.217.154:7000,
> 192.168.217.155:7000, 192.168.217.156:7000, 192.168.217.157:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 18:47:30 7 [192.168.217.153:7000,
> 192.168.217.154:7000, 192.168.217.155:7000, 192.168.217.156:7000,
> 192.168.217.157:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.162:7000]
> 2011-11-04 17:26:26 6 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 10:59:30 5 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 09:59:03 4 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 09:59:03 3 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
> 2011-11-02 10:33:44 2 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
>
>
> [root@blade157 ~]# collie cluster info
> Cluster status: running
>
> Cluster created at Wed Nov 2 11:02:26 2011
>
> Epoch Time Version
> 2011-11-05 07:39:11 9 [192.168.217.154:7000,
> 192.168.217.155:7000, 192.168.217.156:7000, 192.168.217.157:7000,
> 192.168.217.159:7000, 192.168.217.160:7000]
> 2011-11-05 07:39:11 8 [192.168.217.154:7000,
> 192.168.217.155:7000, 192.168.217.156:7000, 192.168.217.157:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 18:47:30 7 [192.168.217.153:7000,
> 192.168.217.154:7000, 192.168.217.155:7000, 192.168.217.156:7000,
> 192.168.217.157:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.162:7000]
> 2011-11-04 17:26:26 6 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 10:59:32 5 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 10:59:32 4 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-02 10:49:34 3 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
> 2011-11-02 10:33:44 2 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
>
>
> [root@blade159 ~]# collie cluster info
> Cluster status: running
>
> Cluster created at Wed Nov 2 11:02:26 2011
>
> Epoch Time Version
> 2011-11-04 17:26:11 6 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 10:59:17 5 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 09:58:48 4 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-02 14:50:37 3 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
> 2011-11-02 14:34:46 2 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-02 11:02:28 1 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
>
>
> [root@blade160 ~]# collie cluster info
> Cluster status: running
>
> Cluster created at Wed Nov 2 11:02:26 2011
>
> Epoch Time Version
> 2011-11-04 17:26:26 6 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-04 10:59:30 5 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 09:59:02 4 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-02 14:50:46 3 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
> 2011-11-02 14:34:55 2 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.162:7000]
> 2011-11-02 11:02:37 1 [192.168.217.152:7000,
> 192.168.217.153:7000, 192.168.217.154:7000, 192.168.217.155:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
>
>
> [root@blade161 ~]# collie cluster info
> Cluster status: The sheepdog is stopped doing IO, short of living nodes
>
> Cluster created at Wed Nov 2 11:02:26 2011
>
> Epoch Time Version
> 2011-11-04 17:26:51 14 [192.168.217.161:7000]
> 2011-11-04 17:26:51 13 [192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 17:26:51 12 [192.168.217.160:7000,
> 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 17:26:51 11 [192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 17:26:48 10 [192.168.217.157:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
> 2011-11-04 17:26:48 9 [192.168.217.156:7000,
> 192.168.217.157:7000, 192.168.217.159:7000, 192.168.217.160:7000,
> 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 17:26:48 8 [192.168.217.155:7000,
> 192.168.217.156:7000, 192.168.217.157:7000, 192.168.217.159:7000,
> 192.168.217.160:7000, 192.168.217.161:7000, 192.168.217.162:7000]
> 2011-11-04 17:26:48 7 [192.168.217.154:7000,
> 192.168.217.155:7000, 192.168.217.156:7000, 192.168.217.157:7000,
> 192.168.217.159:7000, 192.168.217.160:7000, 192.168.217.161:7000,
> 192.168.217.162:7000]
>
>
> [root@blade162 ~]# collie cluster info
> failed to connect to localhost:7000, Connection refused
> failed to connect to localhost:7000, Connection refused
It seems that a network partition was wrongly detected.
To make the explanation simpler, I'll use the following labels for each
node:
n0: 192.168.217.152
n1: 192.168.217.153
n2: 192.168.217.154
n3: 192.168.217.155
n4: 192.168.217.156
n5: 192.168.217.157
n6: 192.168.217.159
n7: 192.168.217.160
n8: 192.168.217.161
n9: 192.168.217.162
I guess your cluster was split into 5 groups:
{n0}, {n1}, {n2, n3, n4, n5, n6, n7}, {n8}, {n9}.
- n0 received a notification that n[1-9] had left.
- n1 received a notification that n0 and n[2-9] had left.
- n[2-7] received a notification that n0, n1, n8, and n9 had left.
- n8 received a notification that n[0-7] and n9 had left.
- n9 received a notification that n[0-8] had left (and aborted due to the above bug).
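The grouping above can be reproduced with a small sketch. It is not Sheepdog code: the names LEFT_SEEN_BY, surviving_view, and partition_groups are made up for illustration, and the leave notifications are my guess from the list above rather than something read from the logs. Each node's view is the full member list minus the nodes it was told had left, and nodes with identical views form one group.

```python
from collections import defaultdict

# Node labels n0..n9 as defined above.
ALL_NODES = [f"n{i}" for i in range(10)]

# Assumed leave notifications per node (taken from the guessed list above,
# not parsed from the actual logs).
LEFT_SEEN_BY = {
    "n0": {f"n{i}" for i in range(1, 10)},
    "n1": {"n0"} | {f"n{i}" for i in range(2, 10)},
    "n8": {f"n{i}" for i in range(0, 8)} | {"n9"},
    "n9": {f"n{i}" for i in range(0, 9)},
}
for i in range(2, 8):
    LEFT_SEEN_BY[f"n{i}"] = {"n0", "n1", "n8", "n9"}


def surviving_view(node):
    """Members the given node still believes are in the cluster."""
    return frozenset(ALL_NODES) - LEFT_SEEN_BY[node]


def partition_groups():
    """Group nodes that ended up with exactly the same membership view."""
    groups = defaultdict(list)
    for node in ALL_NODES:
        groups[surviving_view(node)].append(node)
    return sorted(groups.values())


if __name__ == "__main__":
    for members in partition_groups():
        print("{" + ", ".join(members) + "}")
```

Every group's members match the view its nodes hold, so each group would look like a self-consistent cluster from the inside, even though no real partition occurred.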
Currently, Sheepdog cannot handle this kind of false detection.
We may be able to avoid this problem by setting appropriate values in
corosync.conf (totem.merge or totem.seqno_unchanged_const?), but I'm
not sure. Does anyone know more about this?
Thanks,
Kazutaka