[stgt] [PATCH] new timer-based work scheduling

FUJITA Tomonori fujita.tomonori at lab.ntt.co.jp
Mon Jan 17 12:05:35 CET 2011


On Sun, 9 Jan 2011 18:02:20 +0200
Alexander Nezhinsky <alexandern at Voltaire.COM> wrote:

> Re-implementing the time-based work scheduler. This patch implements
> a timer-based scheme.
> 
> Work items are scheduled from various application contexts, and put on
> a queue. The current time from gettimeofday() is used instead of
> a jiffie-based mechanism used previosuly. The work item is stamped with
> the expiration time, obtained as the current time plus the timeout period.
> 
> A global timer (one per process) is registered and fired periodically,
> few times a second. When the timer signal is handled, a word is written
> to a dedicated pipe, whose fd is registered with the event loop.
> 
> The event handler reads from the pipe, and examines the inactive work
> queue. All items that have expired (the immediate current time ise used
> for comparison, not the timer or pipe event handling times) are moved
> to the active list and processed one after another. The items that
> might have accumulated and expired in the meantime are not handled
> and are postponed until the next timer event.
> 
> The new scheme handles the timer event through the standard event loop's
> file descritor registration and polling. This removes the previosly used
> call to schedule() function which had to update jiffies.
> 
> There is still no guarantee about the exact work execution time.
> This scheme is suitable for tasks with seconds-scale resolution, so that
> firing the timer few times per second provides a satisfactory accuracy.
> The assumption is that all event handlers take small periods of time.
> If not, in theory, the timer handlers may be delayed indefinitely,
> but then the entire event loop processing gets stalled.
> 
> The immediate clients of this mechansim is the iSNS code and the new
> iSER code to be based on it instead of the previously proposed
> custom timer.
> 
> Signed-off-by: Alexander Nezhinsky <alexandern at voltaire.com>
> ---
>  usr/tgtd.c |   13 ++++-
>  usr/work.c |  177 ++++++++++++++++++++++++++++++++++++++++++++++++++++-------
>  usr/work.h |    8 ++-
>  3 files changed, 171 insertions(+), 27 deletions(-)
> 
> diff --git a/usr/tgtd.c b/usr/tgtd.c
> index 2fd4959..5b000d3 100644
> --- a/usr/tgtd.c
> +++ b/usr/tgtd.c
> @@ -329,6 +329,8 @@ static int tgt_exec_scheduled(void)
>  	return work_remains;
>  }
>  
> +#define TGTD_EVENT_TIMEOUT	2000
> +
>  static void event_loop(void)
>  {
>  	int nevent, i, sched_remains, timeout;
> @@ -337,7 +339,7 @@ static void event_loop(void)
>  
>  retry:
>  	sched_remains = tgt_exec_scheduled();
> -	timeout = sched_remains ? 0 : TGTD_TICK_PERIOD * 1000;
> +	timeout = sched_remains ? 0 : TGTD_EVENT_TIMEOUT;

Why we still need this?

We have the timer file descriptor so we don't need this timeout
anymore, right?


>  	nevent = epoll_wait(ep_fd, events, ARRAY_SIZE(events), timeout);
>  	if (nevent < 0) {
> @@ -350,8 +352,7 @@ retry:
>  			tev = (struct event_data *) events[i].data.ptr;
>  			tev->handler(tev->fd, events[i].events, tev->data);
>  		}
> -	} else
> -		schedule();
> +	}
>  
>  	if (system_active)
>  		goto retry;
> @@ -517,12 +518,18 @@ int main(int argc, char **argv)
>  		}
>  	}
>  
> +	err = work_timer_start();
> +	if (err)
> +		exit(1);
> +
>  	bs_init();
>  
>  	event_loop();
>  
>  	lld_exit();
>  
> +	work_timer_stop();
> +
>  	ipc_exit();
>  
>  	log_close();
> diff --git a/usr/work.c b/usr/work.c
> index 3080a59..f825581 100644
> --- a/usr/work.c
> +++ b/usr/work.c
> @@ -1,8 +1,9 @@
>  /*
> - * bogus scheduler
> + * work scheduler, loosely timer-based
>   *
>   * Copyright (C) 2006-2007 FUJITA Tomonori <tomof at acm.org>
>   * Copyright (C) 2006-2007 Mike Christie <michaelc at cs.wisc.edu>
> + * Copyright (C) 2011 Alexander Nezhinsky <alexandern at voltaire.com>
>   *
>   * This program is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU General Public License as
> @@ -21,36 +22,167 @@
>   */
>  #include <stdlib.h>
>  #include <stdint.h>
> +#include <signal.h>
> +#include <sys/epoll.h>
>  
>  #include "list.h"
>  #include "util.h"
>  #include "log.h"
>  #include "work.h"
> +#include "tgtd.h"
> +
> +#define time_before(w1, w2)     timercmp(w1, w2, <)

In general, the patch looks ok to me.

However, as I wrote in another mail, I prefer the timerfd code.

This signal code gets a signal, writes and reads via pipe, and calls
gettimeofday every 250 msecs.

On the other hand, the timerfd code just reads every 250 msecs.

If you don't care about old kernels much, I like to have the timerfd
and old jiffies code. If you care, I add the timerfd code and yours.
--
To unsubscribe from this list: send the line "unsubscribe stgt" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



More information about the stgt mailing list