[sheepdog] [PATCH 1/2] test: add a test for sockfd keepalive

Tue Sep 4 06:08:32 CEST 2012

At Tue, 04 Sep 2012 01:16:00 +0800,
Liu Yuan wrote:
> 
> On 09/03/2012 11:52 PM, MORITA Kazutaka wrote:
> > Another approach:
> >  - set poll timeout, SO_SNDTIMEO, and SO_RCVTIMEO as we did before,
> >    but return SD_RES_NETWORK_ERROR only if epoch is incremented after
> >    timeout.
> 
> I tried setting the poll timeout to 5 seconds as before, but I still get the same
> problem as with '-1' & keepalive. So I think this problem exists in all sheep versions.
> 
> So I guess whatever keeps our keepalive timer from firing also applies to a
> user-defined timer.

Can you give me the diff?  I tried the following draft patch, and it
seems that the problem has gone away.

diff --git a/sheep/gateway.c b/sheep/gateway.c
index 41d712b..6467eee 100644
--- a/sheep/gateway.c
+++ b/sheep/gateway.c
@@ -156,12 +156,21 @@ static int wait_forward_request(struct write_info *wi, struct sd_rsp *rsp)
 	struct pfd_info pi;;
 again:
 	pfd_info_init(wi, &pi);
-	pollret = poll(pi.pfds, pi.nr, -1);
+	pollret = poll(pi.pfds, pi.nr, 5000);
 	if (pollret < 0) {
 		if (errno == EINTR)
 			goto again;
 
 		panic("%m\n");
+	} else if (pollret == 0) {
+		eprintf("poll timeout\n");
+		/* FIXME: try again if epoch is not updated */
+		nr_sent = wi->nr_sent;
+		for (i = 0; i < nr_sent; i++)
+			finish_one_write_err(wi, 0);
+
+		err_ret = SD_RES_NETWORK_ERROR;
+		goto finish_write;
 	}
 
 	nr_sent = wi->nr_sent;
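
For reference, the FIXME in the draft (try again if the epoch is not
updated) could look roughly like the standalone sketch below.  This is not
sheepdog code: set_io_timeouts(), wait_with_epoch_check(), the WAIT_*
codes and the cur_epoch pointer are hypothetical names made up for
illustration, and I assume the caller can read the current epoch through a
pointer.  Only poll(), setsockopt(), SO_SNDTIMEO and SO_RCVTIMEO are real
interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical result codes for this sketch; not sheepdog's SD_RES_* values. */
enum wait_result {
	WAIT_READY = 0,		/* at least one fd has events to inspect */
	WAIT_NETWORK_ERROR = 1,	/* timed out and the epoch has changed */
	WAIT_FATAL = 2,		/* poll() failed with something other than EINTR */
};

/* Set both the send and the receive timeout on a socket; 0 on success. */
static int set_io_timeouts(int fd, int seconds)
{
	struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };

	if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) < 0)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
		return -1;
	return 0;
}

/*
 * Wait for events with a finite poll timeout.  A timeout is reported as a
 * network error only if the epoch differs from start_epoch; otherwise the
 * wait is simply restarted.
 */
static enum wait_result wait_with_epoch_check(struct pollfd *pfds, nfds_t nr,
					      int timeout_ms,
					      uint32_t start_epoch,
					      const volatile uint32_t *cur_epoch)
{
	assert(pfds != NULL && cur_epoch != NULL);

	for (;;) {
		int ret = poll(pfds, nr, timeout_ms);

		if (ret > 0)
			return WAIT_READY;
		if (ret < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, poll again */
			return WAIT_FATAL;
		}
		/* ret == 0: timed out; give up only if the epoch moved on */
		if (*cur_epoch != start_epoch)
			return WAIT_NETWORK_ERROR;
	}
}
```

Note that with this shape a peer that hangs without any epoch change is
still waited on forever, so it only helps if membership changes are
reliably reflected in the epoch.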