[sheepdog] [PATCH 2/2] sockfd cache: revalidate node after grab failure

Liu Yuan namei.unix at gmail.com
Mon Dec 3 08:36:38 CET 2012


From: Liu Yuan <tailai.ly at taobao.com>

This fixes the following case:

A node has been deleted from the sockfd cache, but someone asks us to grab
it. The nid is not in the cache, yet the node may still be alive: a broken
network connection, or the node being too busy to serve requests, can make
other nodes delete it from the sockfd cache. In such cases, we need to add it
back.

Signed-off-by: Liu Yuan <tailai.ly at taobao.com>
---
 sheep/sockfd_cache.c |   29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)
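
For readers who want to see the control flow without the sheep internals, the
grab / revalidate / retry pattern described above can be sketched as a
standalone toy. Everything in it is hypothetical and not part of sheepdog:
struct toy_cache, toy_cache_get(), probe_fn and the two probe helpers stand in
for the sockfd cache, sockfd_cache_get() and the connect_to()/close() liveness
check, and node ids are plain integers rather than struct node_id.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8

/* Hypothetical stand-in for the sockfd cache: just a set of node ids. */
struct toy_cache {
	int ids[MAX_NODES];
	size_t nr;
};

/* Hypothetical liveness probe; the patch uses connect_to() + close(). */
typedef bool (*probe_fn)(int id);

/* Trivial probes so the sketch can be exercised without a network. */
static bool always_alive(int id) { (void)id; return true; }
static bool always_dead(int id) { (void)id; return false; }

static bool toy_cache_has(const struct toy_cache *c, int id)
{
	for (size_t i = 0; i < c->nr; i++)
		if (c->ids[i] == id)
			return true;
	return false;
}

static bool toy_cache_add(struct toy_cache *c, int id)
{
	if (c->nr == MAX_NODES)
		return false;
	c->ids[c->nr++] = id;
	return true;
}

/* Returns true if the node is in the cache or could be put back into it. */
static bool toy_cache_get(struct toy_cache *c, int id, probe_fn probe)
{
	for (;;) {
		if (toy_cache_has(c, id))
			return true;
		/* Miss: re-add the node only if it answers the probe. */
		if (!probe(id) || !toy_cache_add(c, id))
			return false;
		/* Retry the lookup, like "goto grab" in the patch. */
	}
}
```

In this sketch a miss on a dead node returns false and leaves the cache
untouched, while a miss on a live node re-adds it and the retried lookup then
succeeds. A node that is already cached is never probed.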

diff --git a/sheep/sockfd_cache.c b/sheep/sockfd_cache.c
index 33e780b..e0956fb 100644
--- a/sheep/sockfd_cache.c
+++ b/sheep/sockfd_cache.c
@@ -333,15 +333,40 @@ static inline void check_idx(int idx)
 	queue_work(sys->sockfd_wqueue, w);
 }
 
+/* Add the node back if it is still alive */
+static inline int revalidate_node(const struct node_id *nid, char *name)
+{
+	int fd;
+
+	fd = connect_to(name, nid->port);
+	if (fd < 0)
+		return -1;
+	close(fd);
+	sockfd_cache_add(nid);
+
+	return 0;
+}
+
 static struct sockfd *sockfd_cache_get(const struct node_id *nid, char *name)
 {
 	struct sockfd_cache_entry *entry;
 	struct sockfd *sfd;
 	int fd, idx;
 
+grab:
 	entry = sockfd_cache_grab(nid, name, &idx);
-	if (!entry)
-		return NULL;
+	if (!entry) {
+		/*
+		 * The node is deleted, but someone asks us to grab it.
+		 * The nid is not in the sockfd cache, yet the node may still
+		 * be alive: a broken network connection, or the node being too
+		 * busy to serve requests, can make other nodes delete it from
+		 * the sockfd cache. In such cases, we need to add it back.
+		 */
+		if (revalidate_node(nid, name) < 0)
+			return NULL;
+		goto grab;
+	}
 
 	check_idx(idx);
 
-- 
1.7.9.5



