[Sheepdog] sheep eats CPU time on CLOSE_WAIT socket

huxinwei huxinwei at huawei.com
Wed Feb 29 09:22:48 CET 2012


Hi list,

  Sheepdog eats 100% CPU time when a client is forcibly killed.
It turns out that the socket stays in the CLOSE_WAIT state, and reading from such a socket returns 0, so the event_loop spins in a tight loop on these sockets forever.
Here's a simple proposed patch to work around it.

  Please let me know if you have a better idea.

  Thanks.

diff --git a/lib/net.c b/lib/net.c
index 260ae38..d754197 100644
--- a/lib/net.c
+++ b/lib/net.c
@@ -22,6 +22,7 @@
 #include <sys/socket.h>
 #include <sys/stat.h>
 #include <sys/types.h>
+#include <netinet/tcp.h>
 
 #include "sheepdog_proto.h"
 #include "util.h"
@@ -73,6 +74,17 @@ int rx(struct connection *conn, enum conn_state next_state)
 	if (!ret || ret < 0) {
 		if (errno != EAGAIN)
 			conn->c_rx_state = C_IO_CLOSED;
+		else if (!ret) { //FIXME: is this the best place to check?
+			int r;
+			struct tcp_info ti;
+			socklen_t l = sizeof(struct tcp_info);
+			r = getsockopt(conn->fd, SOL_TCP, TCP_INFO, (void*)&ti, &l);
+			if (r < 0) panic("this is really weird");
+			if (ti.tcpi_state == TCP_CLOSE_WAIT) {
+				conn->c_rx_state = C_IO_CLOSED;
+				conn->c_tx_state = C_IO_CLOSED;
+			}
+		}
 		return 0;
 	}
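
For reference, the condition described in this report can be reproduced outside sheepdog. The following is a minimal sketch and not sheepdog code: `enum rx_result`, `classify_read`, `tcp_state` and `demo_close_wait` are hypothetical names invented for illustration, and it assumes Linux with glibc (for `struct tcp_info` and `TCP_CLOSE_WAIT`) and a working loopback interface. It also hints at a possible alternative to querying TCP_INFO: errno is only meaningful when read() returns -1, so a return value of 0 on a stream socket can be treated as "peer closed" on its own.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for struct tcp_info and TCP_CLOSE_WAIT on glibc */
#endif
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not sheepdog's own enum. */
enum rx_result { RX_DATA, RX_RETRY, RX_CLOSED };

/*
 * Classify a read(2) result. err (a saved errno) is consulted only when
 * ret < 0; ret == 0 always means the peer has closed its end.
 */
enum rx_result classify_read(ssize_t ret, int err)
{
	if (ret > 0)
		return RX_DATA;
	if (ret < 0 && (err == EAGAIN || err == EINTR))
		return RX_RETRY;
	return RX_CLOSED;	/* end of stream, or a hard error */
}

/* Return the kernel's TCP state for fd, or -1 on failure (Linux-specific). */
int tcp_state(int fd)
{
	struct tcp_info ti;
	socklen_t len = sizeof(ti);

	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) < 0)
		return -1;
	return ti.tcpi_state;
}

/*
 * Reproduce the report on loopback: the client closes, the server side
 * reads 0 and stays in CLOSE_WAIT. Stores the read() result, the TCP
 * state, and whether poll() still reports the socket readable afterwards.
 * Returns 0 on success, -1 if the sockets could not be set up.
 */
int demo_close_wait(ssize_t *nread, int *state, int *still_readable)
{
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	struct pollfd pfd;
	char buf[16];
	int lfd, cfd = -1, sfd = -1, ret = -1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */
	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lfd, 1) < 0 ||
	    getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0)
		goto out;

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	sfd = accept(lfd, NULL, NULL);
	if (sfd < 0)
		goto out;

	close(cfd);	/* the "killed client": its kernel sends FIN */
	cfd = -1;

	*nread = read(sfd, buf, sizeof(buf));	/* blocks until the FIN arrives */
	*state = tcp_state(sfd);

	/* A level-triggered poll keeps reporting POLLIN at end of stream. */
	pfd.fd = sfd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	*still_readable = poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN);
	ret = 0;
out:
	if (cfd >= 0)
		close(cfd);
	if (sfd >= 0)
		close(sfd);
	close(lfd);
	return ret;
}
```

Calling `demo_close_wait(&n, &state, &readable)` on a Linux host is expected to yield `n == 0`, `state == TCP_CLOSE_WAIT` and a non-zero `readable`. That combination is what keeps an event loop spinning: the socket stays readable until the descriptor is unregistered or closed. Under the assumptions above, `classify_read(n, errno)` reports `RX_CLOSED` for `n == 0` whatever stale value errno still holds.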



