[sheepdog-users] SIGABRT when doing: dog vdi check

Liu Yuan namei.unix at gmail.com
Sat Jan 4 06:28:27 CET 2014


On Fri, Jan 03, 2014 at 10:51:26PM +0100, Marcin Mirosław wrote:
> Hi!
> I'm new to "sheep-running" ;) I'm just starting to try sheepdog, so I'm
> probably doing many things wrong.
> I'm playing with sheepdog-0.7.6.
> 
> First problem (SIGABRT):
> I started multiple sheep daemons on localhost:
> # for x in 0 1 2 3 4; do sheep -c local -j size=128M -p 700$x
> /mnt/sheep/metadata/$x,/mnt/sheep/storage/$x; done
> 
> Next:
> # dog cluster info
> Cluster status: Waiting for cluster to be formatted
> 
> # dog cluster format -c 2:1

0.7.6 doesn't support erasure coding. Try the latest master branch.
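
On 0.7.6 only plain replication is available, so a plain redundancy
format should work instead, e.g. two full copies of every object:

# dog cluster format -c 2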

> using backend plain store
> # dog cluster info
> Cluster status: running, auto-recovery enabled
> 
> Cluster created at Fri Jan  3 20:33:43 2014
> 
> Epoch Time           Version
> 2014-01-03 20:33:43      1 [127.0.0.1:7000, 127.0.0.1:7001,
> 127.0.0.1:7002, 127.0.0.1:7003, 127.0.0.1:7004]
> # dog vdi create testowy 5G
> # gdb -q dog
> Reading symbols from /usr/sbin/dog...Reading symbols from
> /usr/lib64/debug/usr/sbin/dog.debug...done.
> done.
> (gdb)  set args  vdi check testowy
> (gdb) run
> Starting program: /usr/sbin/dog vdi check testowy
> warning: Could not load shared library symbols for linux-vdso.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> warning: File "/lib64/libthread_db-1.0.so" auto-loading has been
> declined by your `auto-load safe-path' set to
> "$debugdir:$datadir/auto-load".
> To enable execution of this file add
>         add-auto-load-safe-path /lib64/libthread_db-1.0.so
> line to your configuration file "/root/.gdbinit".
> To completely disable this security protection add
>         set auto-load safe-path /
> line to your configuration file "/root/.gdbinit".
> For more information about this security protection see the
> "Auto-loading safe path" section in the GDB manual.  E.g., run from the
> shell:
>         info "(gdb)Auto-loading safe path"
> warning: Unable to find libthread_db matching inferior's thread library,
> thread debugging will not be available.
> PANIC: can't find next new idx

It seems the 0.7.x series is buggy here. Hitoshi, can you verify this?

If you are playing around with sheepdog, I'd suggest you try

$ git clone https://github.com/sheepdog/sheepdog.git

which is the latest.
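
It builds with the usual autotools flow; a minimal sketch, assuming the
build dependencies (e.g. the corosync development headers for the default
cluster driver) are installed:

$ cd sheepdog
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install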

>
> Program received signal SIGABRT, Aborted.
> 0x00007ffff784e2c5 in raise () from /lib64/libc.so.6
> (gdb) bt
> #0  0x00007ffff784e2c5 in raise () from /lib64/libc.so.6
> #1  0x00007ffff784f748 in abort () from /lib64/libc.so.6
> #2  0x000055555556192c in get_vnode_next_idx (nr_prev_idxs=<optimized
> out>, prev_idxs=<optimized out>, nr_entries=<optimized out>,
>     entries=<optimized out>) at ../include/sheep.h:105
> #3  oid_to_vnodes (vnodes=0x7fffffffe0c0, nr_copies=2, oid=<optimized
> out>, nr_entries=320, entries=<optimized out>)
>     at ../include/sheep.h:174
> #4  oid_to_vnodes (vnodes=0x7fffffffe0c0, nr_copies=2, oid=<optimized
> out>, nr_entries=320, entries=<optimized out>) at vdi.c:1586
> #5  queue_vdi_check_work (oid=<optimized out>, done=done at entry=0x0,
> wq=wq at entry=0x555556321420, inode=0x7ffff6c13010, inode=0x7ffff6c13010)
>     at vdi.c:1600
> #6  0x00005555555632e8 in vdi_check (argc=<optimized out>,
> argv=<optimized out>) at vdi.c:1634
> #7  0x000055555555bc77 in main (argc=4, argv=0x7fffffffe308) at dog.c:436
> 
> Second problem:
> Using the previously created vdi, I'm mounting sheepfs:
> # sheepfs  /mnt/test/
> Next:
> # echo testowy >/mnt/test/vdi/mount
> # mkfs.ext4 -q /mnt/test/volume/testowy
> /mnt/test/volume/testowy is not a block special device.
> Proceed anyway? (y,n) y
> # mount -o noatime /mnt/test/volume/testowy /mnt/sheep_test/
> # dd if=/dev/zero of=//mnt/sheep_test/zeroes bs=1M count=50
> 50+0 records in
> 50+0 records out
> 52428800 bytes (52 MB) copied, 0,108502 s, 483 MB/s
> 
> Next I stop one sheep daemon; the new situation is as below:
> # dog cluster  info
> Cluster status: running, auto-recovery enabled
> 
> Cluster created at Fri Jan  3 20:33:43 2014
> 
> Epoch Time           Version
> 2014-01-03 21:02:40      2 [127.0.0.1:7000, 127.0.0.1:7001,
> 127.0.0.1:7002, 127.0.0.1:7003]
> 2014-01-03 20:33:43      1 [127.0.0.1:7000, 127.0.0.1:7001,
> 127.0.0.1:7002, 127.0.0.1:7003, 127.0.0.1:7004]
> 
> Now my test file is inaccessible:
> # md5sum /mnt/sheep_test/zeroes
> md5sum: /mnt/sheep_test/zeroes: Input/output error
> 
> # cat /mnt/test/vdi/list
>   Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
>   testowy      0  5.0 GB  272 MB  0.0 MB 2014-01-03 20:34   cac836       2
> 
> 
> Shouldn't my vdi "testowy" still be available even when one node is
> down? (I'll attach sheep.log at the end of the email.)

Yes, it should be. Could you verify whether the latest master has this problem?

> 
> 
> I'd like to ask your advice about the proper (for my purposes)
> configuration of a sheep "cluster". I'd like to prepare a one-node
> storage for keeping backups. I'm going to use a few HDDs (from 2 to 5
> units), so I think I need to use "Multi disk on Single Node Support".
> I'd like to have enough redundancy to survive one HDD failure (I'm
> thinking about using "Erasure Code Support" and 2:1 or 4:1 redundancy).
> I'd also like the flexibility to add or remove HDDs from sheepdog's
> cluster (though I suspect that kind of flexibility isn't possible).
> After reading the wiki I think almost everything above is possible, am
> I right?

Yes, Sheepdog's multi-disk feature supports hotplug and hot-unplug of any
number of disks on any node. A node here means a host that has one or
more disks.
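
For example, recent releases let you add or remove disks on a running
node with the "dog node md" subcommands (a sketch; the paths here are
hypothetical):

$ dog node md plug /mnt/sheep/storage/new-disk
$ dog node md unplug /mnt/sheep/storage/old-disk
$ dog node md info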

> Should I use one daemon per node or multiple sheep daemons on one node
> to do it? (I think one daemon is enough, but the wiki says: "You need
> at least X alive nodes (e.g., 4 nodes in a 4:2 scheme) to serve
> read/write requests. If the number of nodes drops below X, the cluster
> will deny service. Note that if you only have X nodes in the cluster,
> it means you don't have any redundancy parity generated.")
> So I'm not sure whether I should configure one daemon or several.

If you have only one storage host but want sheepdog to manage it like
RAID 5, then you need a 1:1 map, that is, one daemon per disk. This means
you actually run N nodes on the same host. For example, if you have 5
disks with 5 daemons set up, you can format the cluster as 4:1, and the
cluster (all nodes happen to run on the same host) will manage all 5
disks exactly like RAID 5.
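
Concretely, such a single-host setup might look like the following sketch
(the mount points /mnt/disk0../mnt/disk4 are hypothetical, and it needs an
erasure-capable release, i.e. current master):

# for x in 0 1 2 3 4; do sheep -c local -p 700$x /mnt/disk$x; done
# dog cluster format -c 4:1

Here 4:1 means each object is stored as 4 data strips plus 1 parity
strip, so the cluster survives any single disk failure.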

> The last question is about checksumming of data. Is it better to lay
> sheep on ext4 and use btrfs/zfs on the VDI, or to lay sheep on btrfs
> and use ext4 on top of the VDI?
> 

Either is okay, since sheepdog provides a block device abstraction and
doesn't care how you use it. For the file system that you lay sheep on,
ext4 or xfs is suggested, since we only expect POSIX xattr support from
the underlying filesystem.
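
If in doubt, you can verify xattr support on a candidate directory with
the tools from the attr package:

# touch /mnt/sheep/storage/xattr-test
# setfattr -n user.test -v ok /mnt/sheep/storage/xattr-test
# getfattr -n user.test /mnt/sheep/storage/xattr-test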

By the way, if you are only interested in a block device (not for a VM),
you can take a look at iSCSI
(https://github.com/sheepdog/sheepdog/wiki/General-protocol-support-%28iSCSI-and-NBD%29#iscsi),
which will probably outperform sheepfs, since sheepfs is based on FUSE
and its performance is heavily constrained by it.
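
A rough sketch of exporting a vdi over iSCSI with tgt (this assumes a tgt
build that includes the sheepdog backing-store driver; the IQN is made up,
and the exact backing-store syntax is documented on the wiki page above):

# tgtadm --lld iscsi --op new --mode target --tid 1 -T iqn.2014-01.org.example:testowy
# tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 \
      --bstype sheepdog --backing-store tcp:localhost:7000:testowy
# tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL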

Thanks
Yuan


