[sheepdog] [sbd] I/O stuck shortly after starting writes

Marcin Mirosław marcin at mejor.pl
Sun Jun 1 21:56:00 CEST 2014


Hi!
I'm running three sheep daemons locally and have created a VDI with
erasure coding (EC) 2:1. Next I attach it as the sbd0 block device and
run: mkfs.xfs /dev/sbd0 && mount ...
The next step is a simple dd command: dd if=/dev/zero
of=/mnt/test/zero bs=4M count=2000
After a short moment a sheep process is stuck in D state:
sheepdog  4126  1.2  9.8 2199564 151052 ?      Sl   21:41   0:06
/usr/sbin/sheep -n --port 7000 -z 0 /mnt/sdb1 --pidfile
/run/sheepdog/sheepdog.sdb1
sheepdog  4127  0.0  0.0  34468   396 ?        Ds   21:41   0:00
/usr/sbin/sheep -n --port 7000 -z 0 /mnt/sdb1 --pidfile
/run/sheepdog/sheepdog.sdb1
sheepdog  4179  0.2  6.7 1855792 103780 ?      Sl   21:41   0:01
/usr/sbin/sheep -n --port 7001 -z 1 /mnt/sdc1 --pidfile
/run/sheepdog/sheepdog.sdc1
sheepdog  4180  0.0  0.0  34468   396 ?        Ss   21:41   0:00
/usr/sbin/sheep -n --port 7001 -z 1 /mnt/sdc1 --pidfile
/run/sheepdog/sheepdog.sdc1
sheepdog  4231  0.3  7.3 1863228 111700 ?      Sl   21:41   0:01
/usr/sbin/sheep -n --port 7002 -z 2 /mnt/sdd1 --pidfile
/run/sheepdog/sheepdog.sdd1
sheepdog  4232  0.0  0.0  34468   400 ?        Ss   21:41   0:00
/usr/sbin/sheep -n --port 7002 -z 2 /mnt/sdd1 --pidfile
/run/sheepdog/sheepdog.sdd1

dd is stuck as well:
root      4326  0.2  0.3  14764  4664 pts/1    D+   21:44   0:01 dd
if=/dev/zero of=/mnt/test/zero bs=4M count=2000
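
The D-state entries above can be picked out of `ps aux` output
mechanically. A minimal Python sketch, assuming the default `ps aux`
column layout (STAT is the eighth column); the function name
`uninterruptible` and the shortened sample lines are illustrative only:

```python
def uninterruptible(ps_lines):
    """Return (pid, stat, command) for each process whose STAT starts with 'D'."""
    stuck = []
    for line in ps_lines:
        # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header line and anything malformed
        pid, stat, command = int(fields[1]), fields[7], fields[10]
        if stat.startswith("D"):  # D = uninterruptible sleep (usually I/O)
            stuck.append((pid, stat, command))
    return stuck


# Sample lines taken from the listing above, with the command tails shortened.
sample = [
    "sheepdog  4126  1.2  9.8 2199564 151052 ?      Sl   21:41   0:06 /usr/sbin/sheep -n --port 7000 -z 0 /mnt/sdb1",
    "sheepdog  4127  0.0  0.0  34468   396 ?        Ds   21:41   0:00 /usr/sbin/sheep -n --port 7000 -z 0 /mnt/sdb1",
    "root      4326  0.2  0.3  14764  4664 pts/1    D+   21:44   0:01 dd if=/dev/zero of=/mnt/test/zero bs=4M count=2000",
]

for pid, stat, command in uninterruptible(sample):
    print(pid, stat, command)
```

On a live system the same function could be fed the lines of real
`ps aux` output instead of the sample list.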

dmesg shows:

[ 6386.240000] INFO: rcu_sched self-detected stall on CPU { 0}  (t=6001
jiffies g=139833 c=139832 q=86946)
[ 6386.240000] sending NMI to all CPUs:
[ 6386.240000] NMI backtrace for cpu 0
[ 6386.240000] CPU: 0 PID: 4286 Comm: sbd_submiter Tainted: P
O 3.12.20-gentoo #1
[ 6386.240000] Hardware name: Gigabyte Technology Co., Ltd.
965P-S3/965P-S3, BIOS F14A 07/31/2008
[ 6386.240000] task: ffff88001cff3960 ti: ffff88002f78a000 task.ti:
ffff88002f78a000
[ 6386.240000] RIP: 0010:[<ffffffff811d4542>]  [<ffffffff811d4542>]
__const_udelay+0x12/0x30
[ 6386.240000] RSP: 0000:ffff88005f403dc8  EFLAGS: 00000006
[ 6386.240000] RAX: 0000000001062560 RBX: 0000000000002710 RCX:
0000000000000006
[ 6386.240000] RDX: 0000000001140694 RSI: 0000000000000002 RDI:
0000000000418958
[ 6386.240000] RBP: ffff88005f403de8 R08: 000000000000000a R09:
00000000000002bc
[ 6386.240000] R10: 0000000000000000 R11: 00000000000002bb R12:
ffffffff8149eec0
[ 6386.240000] R13: ffffffff8149eec0 R14: ffff88005f40d700 R15:
00000000000153a2
[ 6386.240000] FS:  0000000000000000(0000) GS:ffff88005f400000(0000)
knlGS:0000000000000000
[ 6386.240000] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6386.240000] CR2: 00007f3bf6907000 CR3: 0000000001488000 CR4:
00000000000007f0
[ 6386.240000] Stack:
[ 6386.240000]  ffff88005f403de8 ffffffff8102d00a 0000000000000000
ffffffff814c43b8
[ 6386.240000]  ffff88005f403e58 ffffffff810967ac ffff88001bcb1800
0000000000000001
[ 6386.240000]  ffff88005f403e18 ffffffff81098407 ffff88002f78a000
0000000000000000
[ 6386.240000] Call Trace:
[ 6386.240000]  <IRQ>

[ 6386.240000]  [<ffffffff8102d00a>] ?
arch_trigger_all_cpu_backtrace+0x5a/0x80
[ 6386.240000]  [<ffffffff810967ac>] rcu_check_callbacks+0x2fc/0x570
[ 6386.240000]  [<ffffffff81098407>] ? acct_account_cputime+0x17/0x20
[ 6386.240000]  [<ffffffff810494d3>] update_process_times+0x43/0x80
[ 6386.240000]  [<ffffffff81082621>] tick_sched_handle.isra.12+0x31/0x40
[ 6386.240000]  [<ffffffff81082764>] tick_sched_timer+0x44/0x70
[ 6386.240000]  [<ffffffff8105dc4a>] __run_hrtimer.isra.29+0x4a/0xd0
[ 6386.240000]  [<ffffffff8105e415>] hrtimer_interrupt+0xf5/0x230
[ 6386.240000]  [<ffffffff8102b7f6>] local_apic_timer_interrupt+0x36/0x60
[ 6386.240000]  [<ffffffff8102bc0e>] smp_apic_timer_interrupt+0x3e/0x60
[ 6386.240000]  [<ffffffff8136aaca>] apic_timer_interrupt+0x6a/0x70
[ 6386.240000]  <EOI>

[ 6386.240000]  [<ffffffff811d577d>] ? __write_lock_failed+0xd/0x20
[ 6386.240000]  [<ffffffff813690f2>] _raw_write_lock+0x12/0x20
[ 6386.240000]  [<ffffffffa030579b>] sheep_aiocb_submit+0x2db/0x360 [sbd]
[ 6386.240000]  [<ffffffffa030544e>] ? sheep_aiocb_setup+0x13e/0x1b0 [sbd]
[ 6386.240000]  [<ffffffffa0304740>] 0xffffffffa030473f
[ 6386.240000]  [<ffffffff8105b510>] ? finish_wait+0x80/0x80
[ 6386.240000]  [<ffffffffa03046c0>] ? 0xffffffffa03046bf
[ 6386.240000]  [<ffffffff8105af4b>] kthread+0xbb/0xc0
[ 6386.240000]  [<ffffffff8105ae90>] ? kthread_create_on_node+0x120/0x120
[ 6386.240000]  [<ffffffff81369d7c>] ret_from_fork+0x7c/0xb0
[ 6386.240000]  [<ffffffff8105ae90>] ? kthread_create_on_node+0x120/0x120
[ 6386.240000] Code: c8 5d c3 66 0f 1f 44 00 00 55 48 89 e5 ff 15 ae 2d
2d 00 5d c3 0f 1f 40 00 55 48 8d 04 bd 00 00 00 00 65 48 8b 14 25 20 0d
01 00 <48> 8d 14 92 48 89 e5 48 8d 14 92 f7 e2 48 8d 7a 01 ff 15 7f 2d
[ 6386.240208] NMI backtrace for cpu 1
[ 6386.240213] CPU: 1 PID: 0 Comm: swapper/1 Tainted: P           O
3.12.20-gentoo #1
[ 6386.240215] Hardware name: Gigabyte Technology Co., Ltd.
965P-S3/965P-S3, BIOS F14A 07/31/2008
[ 6386.240218] task: ffff88005d07b960 ti: ffff88005d09a000 task.ti:
ffff88005d09a000
[ 6386.240220] RIP: 0010:[<ffffffff8100b536>]  [<ffffffff8100b536>]
default_idle+0x6/0x10
[ 6386.240227] RSP: 0018:ffff88005d09bea8  EFLAGS: 00000286
[ 6386.240229] RAX: 00000000ffffffed RBX: ffff88005d09bfd8 RCX:
0100000000000000
[ 6386.240231] RDX: 0100000000000000 RSI: 0000000000000000 RDI:
0000000000000001
[ 6386.240234] RBP: ffff88005d09bea8 R08: 0000000000000000 R09:
0000000000000000
[ 6386.240236] R10: ffff88005f480000 R11: 0000000000000e1e R12:
ffffffff814c43b0
[ 6386.240238] R13: ffff88005d09bfd8 R14: ffff88005d09bfd8 R15:
ffff88005d09bfd8
[ 6386.240241] FS:  0000000000000000(0000) GS:ffff88005f480000(0000)
knlGS:0000000000000000
[ 6386.240243] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6386.240246] CR2: 00007f31fe05d000 CR3: 000000002a0f5000 CR4:
00000000000007e0
[ 6386.240247] Stack:
[ 6386.240249]  ffff88005d09beb8 ffffffff8100bc56 ffff88005d09bf18
ffffffff81073c7a
[ 6386.240253]  0000000000000000 ffff88005d09bfd8 ffff88005d09bef8
15e35c4b103d3891
[ 6386.240256]  0000000000000001 0000000000000001 0000000000000001
0000000000000000
[ 6386.240260] Call Trace:
[ 6386.240264]  [<ffffffff8100bc56>] arch_cpu_idle+0x16/0x20
[ 6386.240269]  [<ffffffff81073c7a>] cpu_startup_entry+0xda/0x1c0
[ 6386.240273]  [<ffffffff8102a1f1>] start_secondary+0x1e1/0x240
[ 6386.240275] Code: 21 ff ff ff 90 48 b8 00 00 00 00 01 00 00 00 48 89
07 e9 0e ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 48 89 e5
fb f4 <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 fe 48 c7 c7 40 e7 58 81

[snip]


