[sheepdog] [ANNOUNCE] earthquake: a framework for distributed systems debuggers

Hitoshi Mitake mitake.hitoshi at lab.ntt.co.jp
Mon Dec 8 08:16:19 CET 2014


Hi sheepdog developers and users,

I'd like to let you know about the earthquake project, a framework for
distributed systems debuggers focusing on non-deterministic behavior
and hardware faults.

As you know well, many critical bugs in sheepdog come from the
following two factors:
1. non-deterministic behavior of a multi-process, networked
   environment
2. hardware faults that trigger the recovery sequence, which is very
   important but hard to test

Bugs produced by the above two factors are known to be hard to remove
with ordinary debugging techniques. earthquake is trying to solve this
problem. It forces the target distributed system to proceed in a
deterministic manner via source code translation (currently,
earthquake does this with LLVM libtooling based translation for C
programs). In addition, it cooperates with fault injectors in the
virtual devices of QEMU. With these two features, earthquake enables
fine-grained fault injection, e.g. a disk fault when sheep A is in
state S0, sheep B is in S1, and QEMU is in S2.
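
To make the idea of state-conditioned fault injection concrete, here
is a minimal sketch in Python. It is not earthquake's code or API: the
names Orchestrator, FaultRule and report are hypothetical, and the
"fault" is just a string appended to a list. It only illustrates the
rule "inject a disk fault once sheep A is in S0, sheep B is in S1 and
QEMU is in S2".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FaultRule:
    # Hypothetical rule: run `action` once every listed process
    # is in the listed state.
    required: Dict[str, str]
    action: Callable[[], None]
    fired: bool = False


@dataclass
class Orchestrator:
    # Hypothetical central observer; it only sees what is reported to it.
    states: Dict[str, str] = field(default_factory=dict)
    rules: List[FaultRule] = field(default_factory=list)

    def report(self, process: str, state: str) -> None:
        """Record a state change and fire any rule whose condition now holds."""
        self.states[process] = state
        for rule in self.rules:
            if rule.fired:
                continue  # each rule fires at most once
            if all(self.states.get(p) == s for p, s in rule.required.items()):
                rule.fired = True
                rule.action()


injected = []  # stands in for a real fault injector
orc = Orchestrator()
orc.rules.append(FaultRule(
    required={"sheep A": "S0", "sheep B": "S1", "QEMU": "S2"},
    action=lambda: injected.append("disk fault"),
))

orc.report("sheep A", "S0")
orc.report("sheep B", "S1")  # QEMU has not reached S2 yet, nothing fires
orc.report("QEMU", "S2")     # all three conditions hold, the fault is injected
```

The sketch assumes that every process reports its state changes to a
single observer in a well-defined order, which is the kind of
deterministic progress described above.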

I'll post patches that apply earthquake to debugging sheepdog soon.
If other developers are interested in it, comments and questions are
welcome.

Although it is in very, VERY alpha status (maybe I'm the only person
who can use it), if you are interested, you can obtain the source
code from here: https://github.com/osrg/earthquake

Thanks,
Hitoshi