[sheepdog-users] Re: sheepdog-users Digest, Vol 14, Issue 40

Wenhao Xu wenhao at zelin.io
Tue Jul 2 05:14:55 CEST 2013


Could you please paste your /var/log/cluster/corosync.log and sheep.log?
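
If it helps, here is a minimal Python sketch for gathering the tail of both logs into one paste. The corosync path is the logfile setting from your corosync.conf; the sheep.log path is only my assumption (that sheep writes its log into the store directory you passed to it), so adjust it if yours lives elsewhere. The tail and collect helpers are just illustrative names.

```python
from collections import deque
from pathlib import Path

# The corosync path is the "logfile" value from the quoted corosync.conf.
# The sheep.log path is an ASSUMPTION (log kept in the store directory
# given to sheep); change it if your log is somewhere else.
LOGS = [
    Path("/var/log/cluster/corosync.log"),
    Path("/var/lib/sheep/sheep.log"),
]


def tail(path, n=200):
    """Return the last n lines of path, or a one-line note if it cannot be read."""
    try:
        with open(path, errors="replace") as f:
            return list(deque(f, maxlen=n))
    except OSError as exc:
        return [f"(could not read {path}: {exc})\n"]


def collect(paths=LOGS, n=200):
    """Build one paste-ready text block with a header line per log file."""
    parts = []
    for p in paths:
        parts.append(f"===== {p} (last {n} lines) =====\n")
        parts.extend(tail(p, n))
    return "".join(parts)


if __name__ == "__main__":
    print(collect(), end="")
```

Running it on both nodes would be most useful, since the interesting part is usually what each side logs while the other one tries to join.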

Thanks,
Wenhao



On Tue, Jul 2, 2013 at 10:59 AM, George Y. Hu <huyuanyuan at gamutsoft.com> wrote:

> Dear all,
>
> I installed corosync (1.4.6) and sheepdog (0.6.0) on two CentOS 6 nodes, with
> the following corosync.conf:
>
> -----------------------------------------
> compatibility: whitetank
>
> totem {
>         version: 2
>         secauth: off
>         threads: 0
>         interface {
>                 ringnumber: 0
>                 bindnetaddr: 10.86.213.251 (252 is another)
>                 mcastaddr: 226.94.1.1
>                 mcastport: 5405
>                 ttl: 1
>         }
> }
>
> logging {
>         fileline: off
>         to_stderr: no
>         to_logfile: yes
>         logfile: /var/log/cluster/corosync.log
>         to_syslog: yes
>         debug: off
>         timestamp: on
>         logger_subsys {
>                 subsys: AMF
>                 debug: off
>         }
> }
> --------------------------------------------
>
> When I start the sheepdog service with "sheep /var/lib/sheep", the two nodes
> do not seem to be connected: "collie node list" shows only one node:
> M   Id   Host:Port         V-Nodes       Zone
> -    0   10.86.213.251:7000     64  -69904886
>
> iptables has been disabled, but the problem remains.
> Could somebody help me with this?
>
>
> Best Regards,
>
> George Y. Hu
>
>
> Today's Topics:
>
>    1. Re: Problem with snapshots made with qemu-img (Liu Yuan)
>    2. Crash khugepaged (Valerio Pachera)
>    3. Re: Crash khugepaged (Valerio Pachera)
>    4. Re: cluster format during recovery (MORITA Kazutaka)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 28 Jun 2013 18:02:59 +0800
> From: Liu Yuan <namei.unix at gmail.com>
> To: "Ing. Luca Lazzeroni - Trend Servizi Srl" <luca at gvnet.it>
> Cc: "sheepdog-users at lists.wpkg.org" <sheepdog-users at lists.wpkg.org>
> Subject: Re: [sheepdog-users] Problem with snapshots made with
>         qemu-img
> Message-ID: <20130628100259.GC13194 at ubuntu-precise>
> Content-Type: text/plain; charset=utf-8
>
> On Fri, Jun 28, 2013 at 10:07:47AM +0200, Ing. Luca Lazzeroni - Trend
> Servizi Srl wrote:
> > Hi,
> > if I make a snapshot of a running VM using:
> >
> > qemu-img snapshot -c Pippo Pluto.raw
> >
> > snapshot is created on all nodes, but its tag is updated on all nodes
> > except the one running the VM.
> > On other nodes I can see, via "collie vdi list" the snapshot tag updated
> > correctly, but on the node running the VM I see 2 VDI with the same name,
> > different ID and empty Tag.
>
> It seems that recent qemu-img needs fixes; we didn't cover qemu-img
> snapshots in our functional tests. We should, though.
>
> >
> > If I create the snapshot via "collie vdi snapshot", everything works fine
> > and tag is propagated to all nodes; but I don't know if creating a snapshot
> > with collie of a running VM with writeback cache enabled is a good idea in
> > terms of data integrity?
>
> No problem, the snapshot operation will
> 1 flush the cache first
> 2 mark the vdi as readonly
>
> If there is a problem, it is a bug that should be fixed.
>
> Thanks,
> Yuan
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 28 Jun 2013 17:40:26 +0200
> From: Valerio Pachera <sirio81 at gmail.com>
> To: Lista sheepdog user <sheepdog-users at lists.wpkg.org>
> Subject: [sheepdog-users] Crash khugepaged
> Message-ID:
>         <CAHS0cb-KqoS6wWt_gT+bSQ56KS7Z5iA4yOSpX5zQsoGPX0WV=
> Q at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> What do you think about this?
>
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606691] khugepaged      D
> ffff88021f393780     0    32      2 0x00000000
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606696]  ffff880213793750
> 0000000000000046 ffffffff00000000 ffff880216566f60
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606701]  0000000000013780
> ffff880213795fd8 ffff880213795fd8 ffff880213793750
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606713]  ffff880213795730
> 0000000113795730 ffff88021657fe50 ffff88021f393fd0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606718] Call Trace:
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606727]
> [<ffffffff810b47b3>] ? lock_page+0x20/0x20
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606732]
> [<ffffffff8134da71>] ? io_schedule+0x59/0x71
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606737]
> [<ffffffff810b47b9>] ? sleep_on_page+0x6/0xa
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606740]
> [<ffffffff8134deb4>] ? __wait_on_bit+0x3e/0x71
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606744]
> [<ffffffff810b48f5>] ? wait_on_page_bit+0x6e/0x73
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606751]
> [<ffffffff8105fb09>] ? autoremove_wake_function+0x2a/0x2a
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606756]
> [<ffffffff810c2850>] ? shrink_page_list+0x166/0x73f
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606761]
> [<ffffffff810c9cfa>] ? zone_page_state_add+0x14/0x23
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606765]
> [<ffffffff810c0e13>] ? update_isolated_counts+0x13b/0x15a
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606769]
> [<ffffffff810c32c4>] ? shrink_inactive_list+0x2cd/0x3f0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606774]
> [<ffffffff810be232>] ? __lru_cache_add+0x2b/0x51
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606778]
> [<ffffffff810c3a89>] ? shrink_zone+0x3c0/0x4e6
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606783]
> [<ffffffff810c3fa7>] ? do_try_to_free_pages+0x1cc/0x41c
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606787]
> [<ffffffff810c4462>] ? try_to_free_pages+0xa9/0xe9
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606791]
> [<ffffffff810364e8>] ? should_resched+0x5/0x23
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606796]
> [<ffffffff810bb3ee>] ? __alloc_pages_nodemask+0x4ed/0x7aa
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606801]
> [<ffffffff8100d69f>] ? __switch_to+0x133/0x258
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606806]
> [<ffffffff8134eb77>] ? _raw_spin_unlock_irqrestore+0xe/0xf
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606811]
> [<ffffffff810e5f05>] ? alloc_pages_vma+0x12d/0x136
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606815]
> [<ffffffff810ce1c5>] ? pte_pfn+0x5/0xe
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606819]
> [<ffffffff810ef9bd>] ? khugepaged+0x4dc/0xef3
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606823]
> [<ffffffff8100d69f>] ? __switch_to+0x133/0x258
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606828]
> [<ffffffff8105fadf>] ? add_wait_queue+0x3c/0x3c
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606833]
> [<ffffffff810ef4e1>] ? add_mm_counter.constprop.28+0x9/0x9
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606837]
> [<ffffffff8105f48d>] ? kthread+0x76/0x7e
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606842]
> [<ffffffff81355cb4>] ? kernel_thread_helper+0x4/0x10
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606847]
> [<ffffffff8105f417>] ? kthread_worker_fn+0x139/0x139
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606851]
> [<ffffffff81355cb0>] ? gs_change+0x13/0x13
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606983] sheep           D
> ffff88021f393780     0 30859      1 0x00000000
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606987]  ffff880101d48730
> 0000000000000082 0000000000000000 ffff880216566f60
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606992]  0000000000013780
> ffff8802141dffd8 ffff8802141dffd8 ffff880101d48730
> Jun 28 16:34:20 sheepdog004 kernel: [103658.606997]  ffffea00048c4b20
> 0000000105019098 ffffea0004fdaaa8 ffff880214677be0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607001] Call Trace:
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607005]
> [<ffffffff8134eac4>] ? rwsem_down_failed_common+0xe0/0x114
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607011]
> [<ffffffff811b3af3>] ? call_rwsem_down_write_failed+0x13/0x20
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607015]
> [<ffffffff8134e431>] ? down_write+0x25/0x27
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607019]
> [<ffffffff810d543d>] ? sys_munmap+0x2e/0x52
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607023]
> [<ffffffff81353b52>] ? system_call_fastpath+0x16/0x1b
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607135] tar             D
> ffff88021f293780     0 14370  13938 0x00000000
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607139]  ffff88021472ae60
> 0000000000000086 ffffffff00000000 ffff8802165160c0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607144]  0000000000013780
> ffff880128b77fd8 ffff880128b77fd8 ffff88021472ae60
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607148]  ffffffff8101360a
> 00000001810660a1 ffff880213ff3f30 ffff88021f293fd0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607153] Call Trace:
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607157]
> [<ffffffff8101360a>] ? read_tsc+0x5/0x14
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607161]
> [<ffffffff810b47b3>] ? lock_page+0x20/0x20
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607165]
> [<ffffffff8134da71>] ? io_schedule+0x59/0x71
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607169]
> [<ffffffff810b47b9>] ? sleep_on_page+0x6/0xa
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607172]
> [<ffffffff8134deb4>] ? __wait_on_bit+0x3e/0x71
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607176]
> [<ffffffff810b48f5>] ? wait_on_page_bit+0x6e/0x73
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607181]
> [<ffffffff8105fb09>] ? autoremove_wake_function+0x2a/0x2a
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607186]
> [<ffffffff810b49cd>] ? filemap_fdatawait_range+0x74/0x139
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607191]
> [<ffffffff810b6181>] ? filemap_write_and_wait+0x24/0x30
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607205]
> [<ffffffffa053ac73>] ? nfs_getattr+0x32/0xac [nfs]
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607211]
> [<ffffffff810fda17>] ? vfs_fstat+0x30/0x4e
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607214]
> [<ffffffff810fdb49>] ? sys_newfstat+0x12/0x2b
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607218]
> [<ffffffff810fa376>] ? vfs_write+0xbb/0xe9
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607221]
> [<ffffffff810fa554>] ? sys_write+0x5f/0x6b
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607225]
> [<ffffffff81353b52>] ? system_call_fastpath+0x16/0x1b
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607335] pgrep           D
> ffff88021f293780     0 30870  30869 0x00000000
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607339]  ffff880133e7c730
> 0000000000000086 0000000100000000 ffff8802165160c0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607344]  0000000000013780
> ffff8801340f5fd8 ffff8801340f5fd8 ffff880133e7c730
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607349]  0000000000000020
> 000000011f5fcc08 0000000000000002 ffff880214677be0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607353] Call Trace:
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607357]
> [<ffffffff8134eac4>] ? rwsem_down_failed_common+0xe0/0x114
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607361]
> [<ffffffff811b3ac4>] ? call_rwsem_down_read_failed+0x14/0x30
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607365]
> [<ffffffff8134e44a>] ? down_read+0x17/0x19
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607369]
> [<ffffffff810d1a94>] ? __access_remote_vm+0x3a/0x1c1
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607374]
> [<ffffffff810d2acb>] ? access_process_vm+0x48/0x65
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607378]
> [<ffffffff81140852>] ? proc_pid_cmdline+0x63/0xf0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607382]
> [<ffffffff81141a58>] ? proc_info_read+0x5b/0xb8
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607386]
> [<ffffffff810fa443>] ? vfs_read+0x9f/0xe6
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607390]
> [<ffffffff810fa4cf>] ? sys_read+0x45/0x6b
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607393]
> [<ffffffff81353b52>] ? system_call_fastpath+0x16/0x1b
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607503] pgrep           D
> ffff88021f293780     0 30926  30925 0x00000000
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607507]  ffff880212ed2e20
> 0000000000000086 0000000100000000 ffff8802165160c0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607512]  0000000000013780
> ffff880132047fd8 ffff880132047fd8 ffff880212ed2e20
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607517]  0000000000000020
> 000000011f5fcc08 0000000000000002 ffff880214677be0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607521] Call Trace:
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607525]
> [<ffffffff8134eac4>] ? rwsem_down_failed_common+0xe0/0x114
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607529]
> [<ffffffff811b3ac4>] ? call_rwsem_down_read_failed+0x14/0x30
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607533]
> [<ffffffff8134e44a>] ? down_read+0x17/0x19
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607537]
> [<ffffffff810d1a94>] ? __access_remote_vm+0x3a/0x1c1
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607541]
> [<ffffffff810d2acb>] ? access_process_vm+0x48/0x65
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607545]
> [<ffffffff81140852>] ? proc_pid_cmdline+0x63/0xf0
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607548]
> [<ffffffff81141a58>] ? proc_info_read+0x5b/0xb8
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607552]
> [<ffffffff810fa443>] ? vfs_read+0x9f/0xe6
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607556]
> [<ffffffff810fa4cf>] ? sys_read+0x45/0x6b
> Jun 28 16:34:20 sheepdog004 kernel: [103658.607559]
> [<ffffffff81353b52>] ? system_call_fastpath+0x16/0x1b
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581543] Call Trace:
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581552]
> [<ffffffff810b47b3>] ? lock_page+0x20/0x20
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581557]
> [<ffffffff8134da71>] ? io_schedule+0x59/0x71
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581561]
> [<ffffffff810b47b9>] ? sleep_on_page+0x6/0xa
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581565]
> [<ffffffff8134deb4>] ? __wait_on_bit+0x3e/0x71
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581569]
> [<ffffffff810b48f5>] ? wait_on_page_bit+0x6e/0x73
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581575]
> [<ffffffff8105fb09>] ? autoremove_wake_function+0x2a/0x2a
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581581]
> [<ffffffff810c2850>] ? shrink_page_list+0x166/0x73f
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581586]
> [<ffffffff810c9cfa>] ? zone_page_state_add+0x14/0x23
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581591]
> [<ffffffff810c0e13>] ? update_isolated_counts+0x13b/0x15a
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581595]
> [<ffffffff810c32c4>] ? shrink_inactive_list+0x2cd/0x3f0
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581600]
> [<ffffffff810be232>] ? __lru_cache_add+0x2b/0x51
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581604]
> [<ffffffff810c3a89>] ? shrink_zone+0x3c0/0x4e6
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581608]
> [<ffffffff810c3fa7>] ? do_try_to_free_pages+0x1cc/0x41c
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581612]
> [<ffffffff810c4462>] ? try_to_free_pages+0xa9/0xe9
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581616]
> [<ffffffff810364e8>] ? should_resched+0x5/0x23
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581621]
> [<ffffffff810bb3ee>] ? __alloc_pages_nodemask+0x4ed/0x7aa
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581626]
> [<ffffffff8100d69f>] ? __switch_to+0x133/0x258
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581631]
> [<ffffffff8134eb77>] ? _raw_spin_unlock_irqrestore+0xe/0xf
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581636]
> [<ffffffff810e5f05>] ? alloc_pages_vma+0x12d/0x136
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581640]
> [<ffffffff810ce1c5>] ? pte_pfn+0x5/0xe
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581645]
> [<ffffffff810ef9bd>] ? khugepaged+0x4dc/0xef3
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581649]
> [<ffffffff8100d69f>] ? __switch_to+0x133/0x258
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581654]
> [<ffffffff8105fadf>] ? add_wait_queue+0x3c/0x3c
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581658]
> [<ffffffff810ef4e1>] ? add_mm_counter.constprop.28+0x9/0x9
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581662]
> [<ffffffff8105f48d>] ? kthread+0x76/0x7e
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581667]
> [<ffffffff81355cb4>] ? kernel_thread_helper+0x4/0x10
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581671]
> [<ffffffff8105f417>] ? kthread_worker_fn+0x139/0x139
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581675]
> [<ffffffff81355cb0>] ? gs_change+0x13/0x13
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581808] sheep           D
> ffff88021f393780     0 30859      1 0x00000000
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581813]  ffff880101d48730
> 0000000000000082 0000000000000000 ffff880216566f60
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581817]  0000000000013780
> ffff8802141dffd8 ffff8802141dffd8 ffff880101d48730
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581822]  ffffea00048c4b20
> 0000000105019098 ffffea0004fdaaa8 ffff880214677be0
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581827] Call Trace:
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581831]
> [<ffffffff8134eac4>] ? rwsem_down_failed_common+0xe0/0x114
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581842]
> [<ffffffff811b3af3>] ? call_rwsem_down_write_failed+0x13/0x20
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581846]
> [<ffffffff8134e431>] ? down_write+0x25/0x27
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581850]
> [<ffffffff810d543d>] ? sys_munmap+0x2e/0x52
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581854]
> [<ffffffff81353b52>] ? system_call_fastpath+0x16/0x1b
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581969] tar             D
> ffff88021f293780     0 14370  13938 0x00000000
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581974]  ffff88021472ae60
> 0000000000000086 ffffffff00000000 ffff8802165160c0
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581978]  0000000000013780
> ffff880128b77fd8 ffff880128b77fd8 ffff88021472ae60
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581983]  ffffffff8101360a
> 00000001810660a1 ffff880213ff3f30 ffff88021f293fd0
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581988] Call Trace:
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581992]
> [<ffffffff8101360a>] ? read_tsc+0x5/0x14
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581996]
> [<ffffffff810b47b3>] ? lock_page+0x20/0x20
> Jun 28 16:36:20 sheepdog004 kernel: [103778.581999]
> [<ffffffff8134da71>] ? io_schedule+0x59/0x71
> Jun 28 16:36:20 sheepdog004 kernel: [103778.582003]
> [<ffffffff810b47b9>] ? sleep_on_page+0x6/0xa
> ....
>
> The host has 8G of RAM.
> The host was also exporting an NFS folder.
> The guest was mounting this folder.
> The guest was decompressing a big tar.gz (77G).
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 28 Jun 2013 18:02:37 +0200
> From: Valerio Pachera <sirio81 at gmail.com>
> To: Lista sheepdog user <sheepdog-users at lists.wpkg.org>
> Subject: Re: [sheepdog-users] Crash khugepaged
> Message-ID:
>         <CAHS0cb8TogSOD2pGE+TsScm+o=
> g1kEXGdUNoFWWF-xYoVgfwog at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> 2013/6/28 Valerio Pachera <sirio81 at gmail.com>:
> > What do you think about this?
>
> The crash was host side.
> It was difficult to interact with the host because pgrep, atop, and
> ps aux were freezing.
> 'top' and 'kill' were working.
> I had to kill -9 the guests.
> I've been able to reboot the host (and first shutdown the cluster).
> Collie node list was showing the host still inside the cluster.
>
> I wonder if the crash may be related to excessive network traffic on
> the nic, or it's related to the use of transparent huge pages.
> I set back the default value (madvise) but I'm not going to repeat the
> decompression via nfs today.
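
For checking which transparent huge page mode is actually active after such a change, a minimal Python sketch follows. It assumes the usual sysfs file /sys/kernel/mm/transparent_hugepage/enabled, where the active mode is the one shown in square brackets (for example "always [madvise] never"). active_thp_mode is a hypothetical helper name, not something from sheepdog.

```python
import re
from pathlib import Path

# Standard sysfs location for the THP setting on Linux; treat it as an
# assumption if your kernel was built without transparent huge pages.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed (active) mode from a line like 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    try:
        print(active_thp_mode(THP_ENABLED.read_text()))
    except OSError as exc:
        print(f"could not read {THP_ENABLED}: {exc}")
```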
>
>
> ------------------------------
>
> Message: 4
> Date: Sat, 29 Jun 2013 12:50:06 +0900
> From: MORITA Kazutaka <morita.kazutaka at gmail.com>
> To: Valerio Pachera <sirio81 at gmail.com>
> Cc: Lista sheepdog user <sheepdog-users at lists.wpkg.org>
> Subject: Re: [sheepdog-users] cluster format during recovery
> Message-ID: <m27ghd60ep.wl%morita.kazutaka at gmail.com>
> Content-Type: text/plain; charset=US-ASCII
>
> At Thu, 27 Jun 2013 15:45:36 +0200,
> Valerio Pachera wrote:
> >
> > This is an unusual thing.
> > It's useful for testing purpose only:
> >
> > What happens if cluster format is run during a recovery?
>
> Probably, the recovery process will print a lot of error messages
> after cluster format since it cannot find any objects to be recovered.
>
> Thanks,
>
> Kazutaka
>
>
> ------------------------------
>
> _______________________________________________
> sheepdog-users mailing list
> sheepdog-users at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog-users
>
>
> End of sheepdog-users Digest, Vol 14, Issue 40
> **********************************************
>