[sheepdog] [PATCH v3] tests: add a DynamoRIO client for testing the journaling mechanism

Hitoshi Mitake mitake.hitoshi at gmail.com
Wed Jul 3 03:36:15 CEST 2013


At Wed, 03 Jul 2013 09:36:33 +0900,
MORITA Kazutaka wrote:
> 
> At Mon,  1 Jul 2013 18:23:01 +0900,
> Hitoshi Mitake wrote:
> > 
> > This patch adds a DynamoRIO (often called DR) client for testing the
> > journaling mechanism. By its nature, the recovery path is the most
> > important and the hardest-to-test part of the journaling mechanism,
> > so it needs to be tested well.
> > 
> > But testing targeted recovery paths with the traditional tests/ scripts
> > is hard because:
> > 1. killing sheep processes with kill commands doesn't take their
> >    internal state into account
> > 2. inserting exit() calls into sheep manually is painful work
> > 
> > So this patch implements a fault injection mechanism with DR. DR
> > provides rich functionality for transparent dynamic
> > instrumentation, including the ability to insert function calls
> > before and after system calls. Using this, the fault injector makes
> > sheep exit at points suited to testing the recovery paths of the
> > journaling.
> > 
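
A minimal sketch of the decision such an injector could make when sheep
issues a write(), assuming journal writes are recognized by the descriptor
magic at the start of the buffer. should_inject_fault() and struct desc_head
are hypothetical names used only for illustration; JOURNAL_DESC_MAGIC,
JF_STORE and the scenario IDs are taken from the patch. The DR hook
registration is left out, so this is not the client's actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JOURNAL_DESC_MAGIC 0xfee1900d
#define JF_STORE 0

enum scenario_id {
	SID_UNDEF = -1,
	SID_DO_NOTHING = 0,
	SID_DEATH_AFTER_STORE,
};

/* Hypothetical helper type: the first 8 bytes of journal_descriptor. */
struct desc_head {
	uint32_t magic;
	uint16_t flag;
	uint16_t reserved;
};

/*
 * Return 1 if the buffer handed to write() looks like a journal store
 * record and the active scenario asks for a fault there, 0 otherwise.
 */
static int should_inject_fault(enum scenario_id sid, const void *buf,
			       size_t len)
{
	struct desc_head head;

	if (sid != SID_DEATH_AFTER_STORE)
		return 0;
	if (buf == NULL || len < sizeof(head))
		return 0;

	/* Copy instead of casting so an unaligned buffer is safe to read. */
	memcpy(&head, buf, sizeof(head));
	return head.magic == JOURNAL_DESC_MAGIC && head.flag == JF_STORE;
}
```

A client could run a check like this in a hook before the system call and
terminate the process in the hook after it, so that the record reaches the
journal but the object update that should follow does not happen.
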
> > How to use:
> > 0. preparation
> >    $ cd
> >    $ svn checkout http://dynamorio.googlecode.com/svn/trunk/ dynamorio
> >    $ cd dynamorio
> >    $ mkdir build
> >    $ cd build
> >    $ cmake ..
> >    $ make
> > 
> > (This patch assumes the source code of DR is stored in $HOME/dynamorio,
> > and the build is done in $HOME/dynamorio/build)
> > 
> > 1. build the DR client
> >    $ cd tests/dynamorio/journaling/
> >    $ cmake .
> >    $ make
> > 
> > 2. run tests with preset scenarios
> >    $ ./01.sh   # tests recovery of the object store; afterwards,
> >                # completion of recovery can be checked via sheep.log
> > 
> > The fault injection implemented in this patch is coarse and not
> > capable of exhaustive testing. It is only a supplementary aid, but I
> > believe it is useful.
> > 
> > With this patch, I tested the recovery path for the object store and
> > confirmed that it works well.
> > 
> > Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> > ---
> > v2: DR now supports signalfd(), so we can use the DR client with
> >     shepherd (timerfd() is not supported yet, but its only user is the
> >     local cluster driver).
> 
> What happens if we use the local driver and call timerfd?

read() from timerfd never returns.

> 
> > 
> > v3: rename fi/ -> dynamorio/
> > 
> >  tests/dynamorio/.gitignore                |    6 +
> >  tests/dynamorio/journaling/01.sh          |   21 ++
> >  tests/dynamorio/journaling/CMakeLists.txt |   10 +
> >  tests/dynamorio/journaling/journaling.c   |  358 +++++++++++++++++++++++++++++
> >  4 files changed, 395 insertions(+)
> >  create mode 100644 tests/dynamorio/.gitignore
> >  create mode 100755 tests/dynamorio/journaling/01.sh
> >  create mode 100644 tests/dynamorio/journaling/CMakeLists.txt
> >  create mode 100644 tests/dynamorio/journaling/journaling.c
> > 
> > diff --git a/tests/dynamorio/.gitignore b/tests/dynamorio/.gitignore
> > new file mode 100644
> > index 0000000..09c1215
> > --- /dev/null
> > +++ b/tests/dynamorio/.gitignore
> > @@ -0,0 +1,6 @@
> > +*/CMakeCache.txt
> > +*/CMakeFiles
> > +*/Makefile
> > +*/cmake_install.cmake
> > +*/ldscript
> > +*/*.so
> > diff --git a/tests/dynamorio/journaling/01.sh b/tests/dynamorio/journaling/01.sh
> > new file mode 100755
> > index 0000000..dd2fb45
> > --- /dev/null
> > +++ b/tests/dynamorio/journaling/01.sh
> > @@ -0,0 +1,21 @@
> > +#! /bin/bash
> > +
> > +# fault injection for testing journaling with object store
> > +
> > +sudo killall -KILL sheep
> > +sudo killall -KILL shepherd
> > +
> > +sudo rm -rf /tmp/sheepdog*
> > +sudo mkdir -p /tmp/sheepdog/0
> > +
> > +sudo shepherd
> > +
> > +sudo ~/dynamorio/build/bin64/drrun -c libjournaling.so 1 --\
> > + sheep -d -c shepherd:127.0.0.1 -p 7000 -j size=64 /tmp/sheepdog/0
> > +
> > +sleep 3
> > +
> > +collie cluster format -c 1
> > +collie vdi create test 100M
> > +
> > +sudo sheep -d -c shepherd:127.0.0.1 -p 7000 -j size=64 /tmp/sheepdog/0
> > diff --git a/tests/dynamorio/journaling/CMakeLists.txt b/tests/dynamorio/journaling/CMakeLists.txt
> > new file mode 100644
> > index 0000000..fcfabbd
> > --- /dev/null
> > +++ b/tests/dynamorio/journaling/CMakeLists.txt
> > @@ -0,0 +1,10 @@
> > +cmake_minimum_required(VERSION 2.8)
> > +
> > +SET(DynamoRIO_DIR "~/dynamorio/exports/cmake")
> > +find_package(DynamoRIO)
> > +
> > +add_library(journaling SHARED journaling.c)
> > +configure_DynamoRIO_client(journaling)
> > +
> > +use_DynamoRIO_extension(journaling drwrap)
> > +use_DynamoRIO_extension(journaling drcontainers)
> > diff --git a/tests/dynamorio/journaling/journaling.c b/tests/dynamorio/journaling/journaling.c
> > new file mode 100644
> > index 0000000..4c604df
> > --- /dev/null
> > +++ b/tests/dynamorio/journaling/journaling.c
> > @@ -0,0 +1,358 @@
> > +/*
> > + * Copyright (C) 2013 Nippon Telegraph and Telephone Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License version
> > + * 2 as published by the Free Software Foundation.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +/*
> > + * DynamoRIO client for fault injection which can be used to test
> > + * journaling mechanism
> > + */
> > +
> > +#include "dr_api.h"
> > +#include "drwrap.h"
> > +#include "drmgr.h"
> > +#include "hashtable.h"
> > +#include "dr_tools.h"
> > +
> > +#include <string.h>
> > +#include <syscall.h>
> > +
> > +#include <stdint.h>
> > +
> > +struct journal_descriptor {
> > +	uint32_t magic;
> > +	uint16_t flag;
> > +	uint16_t reserved;
> > +	union {
> > +		uint32_t epoch;
> > +		uint64_t oid;
> > +	};
> > +	uint64_t offset;
> > +	uint64_t size;
> > +	uint8_t create;
> > +	uint8_t pad[475];
> > +} __packed;
> 
> This must be the same as the one in sheep/journal.c, and I think it is
> likely to break when we update the definition.  I think we should
> define journal_descriptor in include/internal_proto.h and this file
> should include the header file.

Moving the definition of the struct would go against the intent of the
journaling implementation, so I thought it shouldn't be done just for
testing purposes.

There may be effective ways of obtaining this kind of information from
the sheep code:

1. parsing DWARF information and extracting definitions of data types
2. employing static analysis techniques which work on source code

I'd like to employ this kind of technique in the future. But I think
doing it now is overkill because the definition of
journal_descriptor rarely changes and we have only one fault
injector that depends on internal data structures.
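
As a lighter-weight stopgap than either technique, the duplicated
definition could carry compile-time layout checks. The following is only a
sketch under stated assumptions, not something this patch does: it uses
__attribute__((packed)) on the assumption that __packed is not defined in
the client build, and the expected numbers (508 bytes in total, the listed
offsets) are simply what the field sizes of the struct quoted above add up
to when packed.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copy of the descriptor, as quoted from the patch. */
struct journal_descriptor {
	uint32_t magic;
	uint16_t flag;
	uint16_t reserved;
	union {
		uint32_t epoch;
		uint64_t oid;
	};
	uint64_t offset;
	uint64_t size;
	uint8_t create;
	uint8_t pad[475];
} __attribute__((packed));

/* 4 + 2 + 2 + 8 + 8 + 8 + 1 + 475 = 508 bytes when packed. */
_Static_assert(sizeof(struct journal_descriptor) == 508,
	       "journal_descriptor size changed");
_Static_assert(offsetof(struct journal_descriptor, flag) == 4,
	       "flag moved");
_Static_assert(offsetof(struct journal_descriptor, oid) == 8,
	       "oid moved");
_Static_assert(offsetof(struct journal_descriptor, offset) == 16,
	       "offset moved");
_Static_assert(offsetof(struct journal_descriptor, size) == 24,
	       "size moved");
_Static_assert(offsetof(struct journal_descriptor, create) == 32,
	       "create moved");
```

Checks like these only catch accidental edits to the copy. Detecting that
the copy has drifted from sheep/journal.c would need the same expected
values asserted on the sheep side as well.
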

> 
> > +
> > +#define JOURNAL_DESC_MAGIC 0xfee1900d
> > +
> > +#define JF_STORE 0
> > +#define JF_REMOVE_OBJ 2
> > +
> > +enum scenario_id {
> > +	SID_UNDEF = -1,
> > +
> > +	SID_DO_NOTHING = 0,
> > +	SID_DEATH_AFTER_STORE,
> > +};
> > +
> > +enum scenario_id sid = SID_UNDEF;
> > +
> > +static int tls_idx;
> > +static file_t log_file = INVALID_FILE;
> > +
> > +static int jfile_fds[2];
> > +
> > +#define fi_printf(fmt, args...) do {					\
> > +		if (log_file == INVALID_FILE)				\
> > +			dr_printf("%s(%d), " fmt,			\
> > +				__func__, __LINE__, ## args);		\
> > +		else							\
> > +			dr_fprintf(log_file, "%s(%d), " fmt,		\
> > +				__func__, __LINE__, ## args);		\
> > +	} while (0)
> > +
> > +#define die(fmt, args...) do {						\
> > +		fi_printf("FATAL " fmt, ## args);			\
> > +		dr_abort();	/* stop here instead of returning */	\
> > +	} while (0)
> > +
> > +static void *xmalloc(size_t size)
> > +{
> > +	void *ret;
> > +
> > +	ret = __wrap_malloc(size);
> > +	if (!ret)
> > +		die("allocating memory with __wrap_malloc() failed\n");
> > +
> > +	return ret;
> > +}
> > +
> > +static void *xzalloc(size_t size)
> > +{
> > +	void *ret;
> > +
> > +	ret = xmalloc(size);
> > +	memset(ret, 0, size);
> > +
> > +	return ret;
> > +}
> > +
> > +static void *xcalloc(size_t size, size_t nmnb)
> > +{
> > +	void *ret;
> > +	size_t length = size * nmnb;
> > +
> > +	ret = __wrap_malloc(length);
> > +	if (!ret)
> > +		die("allocating memory with __wrap_malloc() failed\n");
> > +	memset(ret, 0, length);
> > +
> > +	return ret;
> > +}
> > +
> > +static void xfree(void *ptr)
> > +{
> > +	__wrap_free(ptr);
> > +}
> 
> Is there no plan to use DynamoRIO for sheepdog features other than
> journaling?  I guess some functions in this file should be defined in
> a more common place like common.c.

Of course I'll use DR for testing other parts of sheepdog
(e.g. shepherd). But I believe extracting common functions as APIs
should be done when we add a second client, for the sake of better
design. What do you think?

Thanks,
Hitoshi



