[sheepdog] [PATCH v4 2/2] tests: add a DynamoRIO client for testing the journaling mechanism

Kai Zhang kyle at zelin.io
Mon Jul 8 10:14:41 CEST 2013


On Jul 5, 2013, at 11:06 AM, Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp> wrote:

> This patch adds a DynamoRIO (often called DR) client for testing the
> journaling mechanism. By its nature, the recovery path is the most
> important and hardest-to-test part of the journaling mechanism, so it
> needs to be tested well.
> 
> But testing targeted recovery paths with the traditional tests/ stuff is
> hard because:
> 1. killing sheep processes with kill commands doesn't take into account
>   their internal state
> 2. inserting exit()s into sheep manually is painful work
> 
> So this patch implements a fault injection mechanism with DR. DR
> provides rich functionality for transparent dynamic
> instrumentation. One of its features makes it possible to insert
> function calls before and after system calls. With this, the
> fault injection mechanism lets sheep exit at suitable times for
> testing the recovery paths of the journaling.
> 
> How to use:
> 0. preparation
>   $ cd
>   $ svn checkout http://dynamorio.googlecode.com/svn/trunk/ dynamorio
>   $ cd dynamorio
>   $ mkdir build
>   $ cd build
>   $ cmake ..
>   $ make
> 
> (This patch assumes the source code of DR is stored in $HOME/dynamorio,
> and the build is done in $HOME/dynamorio/build)
> 

Is it possible to remove this assumption?
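
One possible direction, as a rough sketch only: 01.sh could take the DR
location from an environment variable and fall back to the current
default. DYNAMORIO_HOME and the two helper functions below are
hypothetical names, not something the patch or DR defines; only the
build/bin64/drrun suffix comes from the patch itself.

```shell
#!/bin/bash

# DYNAMORIO_HOME is a hypothetical variable name (an assumption for this
# sketch). When it is unset or empty, fall back to the location the
# patch currently assumes: $HOME/dynamorio.
dr_home() {
    printf '%s\n' "${DYNAMORIO_HOME:-$HOME/dynamorio}"
}

# Path of drrun inside the build tree, matching the layout used in 01.sh.
drrun_path() {
    printf '%s\n' "$(dr_home)/build/bin64/drrun"
}
```

01.sh could then call "$(drrun_path)" instead of ~/dynamorio/build/bin64/drrun.
The DynamoRIO_DIR setting in CMakeLists.txt would need a similar override.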


> 1. build the DR client
>   $ cd tests/dynamorio/journaling/
>   $ cmake .
>   $ make
> 

Can we use autotools to build this, so that we have a uniform build
method across the tree?
Is there a reason that we have to use cmake?


> 2. run tests with preset scenarios
>   $ ./01.sh   # for testing recovery of object store
>               # after this, actual completion of recovery can be
>               # checked via sheep.log
> 
> The fault injection implemented by this patch is quite loose and not
> capable of exhaustive testing. It is only a supporting aid, but I
> believe it is useful.
> 
> With this patch, I tested the recovery path for the object store and
> checked that it works well.
> 
> Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> ---
> v2: DR now supports signalfd(), so we can use the DR client with
>    shepherd (timerfd() is not supported yet, but its only user is the
>    local cluster driver).
> 
> v3: rename fi/ -> dynamorio/
> 
> v4: various refactoring.
>    e.g. extracting common parts from journaling.c
> 
> tests/dynamorio/.gitignore                |    7 +
> tests/dynamorio/journaling/01.sh          |   21 +++
> tests/dynamorio/journaling/CMakeLists.txt |   10 +
> tests/dynamorio/journaling/journaling.c   |  287 +++++++++++++++++++++++++++++
> 4 files changed, 325 insertions(+)
> create mode 100644 tests/dynamorio/.gitignore
> create mode 100755 tests/dynamorio/journaling/01.sh
> create mode 100644 tests/dynamorio/journaling/CMakeLists.txt
> create mode 100644 tests/dynamorio/journaling/journaling.c
> 
> diff --git a/tests/dynamorio/.gitignore b/tests/dynamorio/.gitignore
> new file mode 100644
> index 0000000..e57e010
> --- /dev/null
> +++ b/tests/dynamorio/.gitignore
> @@ -0,0 +1,7 @@
> +*/CMakeCache.txt
> +*/CMakeFiles
> +*/Makefile
> +*/cmake_install.cmake
> +*/ldscript
> +*/*.so
> +*/fi.log
> diff --git a/tests/dynamorio/journaling/01.sh b/tests/dynamorio/journaling/01.sh
> new file mode 100755
> index 0000000..87ea5ac
> --- /dev/null
> +++ b/tests/dynamorio/journaling/01.sh
> @@ -0,0 +1,21 @@
> +#! /bin/bash
> +
> +# fault injection for testing journaling with object store
> +
> +sudo killall -KILL sheep
> +sudo killall -KILL shepherd
> +
> +sudo rm -rf /tmp/sheepdog/fi/*
> +sudo mkdir -p /tmp/sheepdog/fi/0
> +
> +sudo shepherd
> +
> +sudo ~/dynamorio/build/bin64/drrun -c libjournaling.so 1 -- \
> +    sheep -d -c shepherd:127.0.0.1 -p 7000 -j size=64 /tmp/sheepdog/fi/0
> +
> +sleep 3
> +
> +collie cluster format -c 1
> +collie vdi create test 100M
> +
> +sudo sheep -d -c shepherd:127.0.0.1 -p 7000 -j size=64 /tmp/sheepdog/fi/0
> diff --git a/tests/dynamorio/journaling/CMakeLists.txt b/tests/dynamorio/journaling/CMakeLists.txt
> new file mode 100644
> index 0000000..d800414
> --- /dev/null
> +++ b/tests/dynamorio/journaling/CMakeLists.txt
> @@ -0,0 +1,10 @@
> +cmake_minimum_required(VERSION 2.8)
> +
> +SET(DynamoRIO_DIR "~/dynamorio/exports/cmake")
> +find_package(DynamoRIO)
> +
> +add_library(journaling SHARED journaling.c ../common.c)
> +configure_DynamoRIO_client(journaling)
> +
> +use_DynamoRIO_extension(journaling drwrap)
> +use_DynamoRIO_extension(journaling drcontainers)
> diff --git a/tests/dynamorio/journaling/journaling.c b/tests/dynamorio/journaling/journaling.c
> new file mode 100644
> index 0000000..228df6e
> --- /dev/null
> +++ b/tests/dynamorio/journaling/journaling.c
> @@ -0,0 +1,287 @@
> +/*
> + * Copyright (C) 2013 Nippon Telegraph and Telephone Corporation.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License version
> + * 2 as published by the Free Software Foundation.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +/*
> + * journaling.c: DynamoRIO based fault injector for testing the journaling
> + * mechanism of sheep
> + */
> +
> +#include "dr_api.h"
> +#include "drwrap.h"
> +#include "drmgr.h"
> +#include "hashtable.h"
> +#include "dr_tools.h"
> +
> +#include "../common.h"
> +
> +/* for data structures and macros of the journaling subsystem */
> +#include "../../../include/internal_proto.h"
> +
> +#include <string.h>
> +#include <syscall.h>
> +
> +#include <stdint.h>
> +
> +enum scenario_id {
> +	SID_UNDEF = -1,
> +
> +	SID_DO_NOTHING = 0,
> +	SID_DEATH_AFTER_STORE,
> +};
> +
> +enum scenario_id sid = SID_UNDEF;
> +
> +static int tls_idx;
> +static int jfile_fds[2];
> +
> +enum thread_state {
> +	THREAD_STATE_DEFAULT,
> +
> +	THREAD_STATE_OPENING_JFILE_0,
> +	THREAD_STATE_OPENING_JFILE_1,
> +
> +	THREAD_STATE_WRITING_JFILE,
> +};
> +
> +enum pwrite_state {
> +	PWRITE_WRITING_STORE,
> +};
> +
> +struct per_thread_journal_state {
> +	enum thread_state state;
> +	int using_fd;
> +
> +	enum pwrite_state pwrite_state;
> +};
> +
> +static void thread_init_event(void *drcontext)
> +{
> +	struct per_thread_journal_state *new_jstate;
> +
> +	new_jstate = xzalloc(sizeof(*new_jstate));
> +
> +	drmgr_set_tls_field(drcontext, tls_idx, new_jstate);
> +}
> +
> +static void thread_exit_event(void *drcontext)
> +{
> +	struct per_thread_journal_state *jstate;
> +
> +	jstate = (struct per_thread_journal_state *)
> +		drmgr_get_tls_field(drcontext, tls_idx);
> +	xfree(jstate);
> +}
> +
> +static void pre_open(void *drcontext)
> +{
> +	const char *path;
> +	struct per_thread_journal_state *jstate;
> +
> +	jstate = (struct per_thread_journal_state *)
> +		drmgr_get_tls_field(drcontext, tls_idx);
> +
> +	path = (const char *)dr_syscall_get_param(drcontext, 0);
> +
> +	if (strstr(path, "journal_file0")) {
> +		fi_printf("journal_file0 is opened\n");
> +		DR_ASSERT(jstate->state == THREAD_STATE_DEFAULT);
> +		jstate->state = THREAD_STATE_OPENING_JFILE_0;
> +	} else if (strstr(path, "journal_file1")) {
> +		fi_printf("journal_file1 is opened\n");
> +		DR_ASSERT(jstate->state == THREAD_STATE_DEFAULT);
> +		jstate->state = THREAD_STATE_OPENING_JFILE_1;
> +	}
> +}
> +
> +static void pre_close(void *drcontext)
> +{
> +}
> +
> +static void pre_read(void *drcontext)
> +{
> +}
> +
> +static void pre_write(void *drcontext)
> +{
> +}
> +
> +static void pre_pwrite(void *drcontext)
> +{
> +	int fd;
> +	struct per_thread_journal_state *jstate;
> +	struct journal_descriptor *jd;
> +
> +	fd = (int)dr_syscall_get_param(drcontext, 0);
> +	if (fd != jfile_fds[0] && fd != jfile_fds[1])
> +		return;
> +
> +	jstate = (struct per_thread_journal_state *)
> +		drmgr_get_tls_field(drcontext, tls_idx);
> +
> +	fi_printf("writing journal\n");
> +	jstate->using_fd = fd;
> +	jstate->state = THREAD_STATE_WRITING_JFILE;
> +
> +	jd = (struct journal_descriptor *)dr_syscall_get_param(drcontext, 1);
> +	DR_ASSERT(jd->magic == JOURNAL_DESC_MAGIC);
> +	if (jd->flag == JF_STORE)
> +		jstate->pwrite_state = PWRITE_WRITING_STORE;
> +	else if (jd->flag == JF_REMOVE_OBJ)
> +		fi_printf("FIXME: testing object removal is not supported yet\n");
> +	else
> +		die("unknown journal flag: %d\n", jd->flag);
> +}
> +
> +static bool pre_syscall(void *drcontext, int sysnum)
> +{
> +	switch (sysnum) {
> +	case SYS_open:
> +		pre_open(drcontext);
> +		break;
> +	case SYS_close:
> +		pre_close(drcontext);
> +		break;
> +	case SYS_read:
> +		pre_read(drcontext);
> +		break;
> +	case SYS_write:
> +		pre_write(drcontext);
> +		break;
> +	case SYS_pwrite64:
> +		pre_pwrite(drcontext);
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	return true;
> +}
> +
> +static void post_open(void *drcontext)
> +{
> +	int fd;
> +	struct per_thread_journal_state *jstate;
> +
> +	jstate = (struct per_thread_journal_state *)
> +		drmgr_get_tls_field(drcontext, tls_idx);
> +
> +	if (jstate->state == THREAD_STATE_DEFAULT)
> +		return;
> +
> +	fd = (int)dr_syscall_get_result(drcontext);
> +
> +	if (jstate->state == THREAD_STATE_OPENING_JFILE_0) {
> +		fi_printf("fd of jfile0: %d\n", fd);
> +		jfile_fds[0] = fd;
> +	} else if (jstate->state == THREAD_STATE_OPENING_JFILE_1) {
> +		fi_printf("fd of jfile1: %d\n", fd);
> +		jfile_fds[1] = fd;
> +	}
> +
> +	jstate->state = THREAD_STATE_DEFAULT;
> +}
> +
> +static void post_close(void *drcontext)
> +{
> +}
> +
> +static void post_read(void *drcontext)
> +{
> +}
> +
> +static void post_write(void *drcontext)
> +{
> +}
> +
> +static void post_pwrite64(void *drcontext)
> +{
> +	int fd;
> +	struct per_thread_journal_state *jstate;
> +
> +	jstate = (struct per_thread_journal_state *)
> +		drmgr_get_tls_field(drcontext, tls_idx);
> +
> +	if (jstate->state != THREAD_STATE_WRITING_JFILE)
> +		return;
> +
> +	fd = jstate->using_fd;
> +	DR_ASSERT(fd == jfile_fds[0] || fd == jfile_fds[1]);
> +
> +	switch (sid) {
> +	case SID_DEATH_AFTER_STORE:
> +		if (jstate->pwrite_state != PWRITE_WRITING_STORE)
> +			return;
> +
> +		fi_printf("scenario is death after writing normal store,"
> +			" exiting\n");
> +		exit(1);
> +		break;
> +	default:
> +		die("invalid SID: %d\n", sid);
> +		break;
> +	}
> +}
> +
> +static void post_syscall(void *drcontext, int sysnum)
> +{
> +	switch (sysnum) {
> +	case SYS_open:
> +		post_open(drcontext);
> +		break;
> +	case SYS_close:
> +		post_close(drcontext);
> +		break;
> +	case SYS_read:
> +		post_read(drcontext);
> +		break;
> +	case SYS_write:
> +		post_write(drcontext);
> +		break;
> +	case SYS_pwrite64:
> +		post_pwrite64(drcontext);
> +		break;
> +	}
> +}
> +
> +static bool pre_syscall_filter(void *drcontext, int sysnum)
> +{
> +	return true;
> +}
> +
> +static bool post_syscall_filter(void *drcontext, int sysnum)
> +{
> +	return true;
> +}
> +
> +DR_EXPORT void dr_init(client_id_t id)
> +{
> +	const char *option;
> +
> +	option = dr_get_options(id);
> +	fi_printf("the passed option to this client: %s\n", option);
> +	sid = atoi(option);
> +	fi_printf("sid: %d\n", sid);
> +
> +	init_log_file();
> +
> +	dr_register_filter_syscall_event(pre_syscall_filter);
> +	drmgr_init();
> +
> +	tls_idx = drmgr_register_tls_field();
> +	drmgr_register_pre_syscall_event(pre_syscall);
> +	drmgr_register_post_syscall_event(post_syscall);
> +
> +	drmgr_register_thread_init_event(thread_init_event);
> +	drmgr_register_thread_exit_event(thread_exit_event);
> +
> +	jfile_fds[0] = -1;
> +	jfile_fds[1] = -1;
> +}
> -- 
> 1.7.10.4
> 
> -- 
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog



