<div dir="ltr"><div><div><div><div><br></div>Hi all<br><br></div>I know this is a bit late, but I think I figured out the issue. I had forgotten that I had experimented with creating a libvirt pool for sheepdog. When rebooting that machine, virtual machines would not start until sheepdog was close to or fully recovered, even though they were not actually using the pool, which was empty. After deleting the pool, the problem seems to have gone away. While using a pool in libvirt seems like it would be a good way to manage my virtual machines, allowing easier migration, I will stay away from it for now.<br><br></div>thanks<br></div>Philip<br><div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Oct 8, 2014 at 2:06 AM, Liu Yuan <span dir="ltr"><<a href="mailto:namei.unix@gmail.com" target="_blank">namei.unix@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Wed, Oct 08, 2014 at 02:00:01PM +0800, Liu Yuan wrote:<br>
> On Sun, Oct 05, 2014 at 03:09:10PM +0900, Hitoshi Mitake wrote:
> > On Fri, Oct 3, 2014 at 3:49 PM, Valerio Pachera <sirio81@gmail.com> wrote:
> > > Are you able to tell us whether the node was accessing the disks mostly
> > > for reading or for writing?
> > > You can see that with atop during recovery.
> > >
> > > @Hitoshi, I remember sheepdog transferring all the data during recovery
> > > in old versions.
> > > Later on, checksumming was introduced.
> > > Does 0.7.5 already use checksums?
> >
> > It wouldn't be related to checksumming. 0.7.x tends to transfer a larger
> > amount of data than 0.8.x. But requests from VMs being stopped completely
> > is really strange, because sheepdog prioritizes the requests from VMs
> > (as Andrew mentioned).
>
> This is the very reason why I added multi-threaded recovery. The problem
> looks very close to the one that we found in the past:
>
> "
> * Rationale for multi-threaded recovery:
> * 1. If one node is added, we find that all the VMs on other nodes will
> *    get noticeably affected until 50% data is transferred to the new
> *    node.
> "
>
> Especially when adding a new node, there is a big chance that all the VMs
> have their data scattered onto the new node, while there is only a single
> thread for data recovery.
>
> As you noticed, we prioritize VM requests over recovery requests, but for
> this case (adding a new node), imagine that:
>
> 1. There is no data on the new node.
> 2. All the VMs try to read/write their slice of data on the new node; there
>    might be thousands of requests, proportional to the number of VMs.
> 3. Before we can read/write the targeted objects, we need to
>    transfer/rebuild them first.
> 4. So the bottleneck is how fast we can transfer/rebuild the targeted
>    objects on this new node.
> 5. Unfortunately, there is a single thread to recover the targeted objects,
>    because this is one of the means by which we prioritize VM I/O over
>    recovery requests (a sketch follows below).
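A toy pthreads sketch of that single-worker bottleneck (points 1-5); the
names and data structures here are hypothetical, not sheepdog's actual
internals:

#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define NR_OBJS 1024   /* toy assumption: oids are small array indices */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done = PTHREAD_COND_INITIALIZER;
static uint64_t queue[NR_OBJS];   /* oids still to be recovered (FIFO) */
static int nr_queued;
static bool recovered[NR_OBJS];   /* recovered[oid]: object was rebuilt */

/* Move oid to the queue head: the "prioritize VM I/O" part. */
static void move_to_front(uint64_t oid)
{
        for (int i = 0; i < nr_queued; i++)
                if (queue[i] == oid) {
                        memmove(queue + 1, queue, i * sizeof(queue[0]));
                        queue[0] = oid;
                        return;
                }
}

/* VM I/O path: block until the target object exists locally. */
static void wait_for_object(uint64_t oid)
{
        pthread_mutex_lock(&lock);
        move_to_front(oid);       /* prioritized, but still waiting on */
        while (!recovered[oid])   /* the ONE worker below */
                pthread_cond_wait(&done, &lock);
        pthread_mutex_unlock(&lock);
}

/* The single recovery worker: the bottleneck when a node joins. */
static void *recovery_thread(void *arg)
{
        (void)arg;
        for (;;) {
                pthread_mutex_lock(&lock);
                if (!nr_queued) {
                        pthread_mutex_unlock(&lock);
                        break;
                }
                uint64_t oid = queue[0];
                memmove(queue, queue + 1, --nr_queued * sizeof(queue[0]));
                pthread_mutex_unlock(&lock);

                usleep(10000);    /* stand-in for the network transfer */

                pthread_mutex_lock(&lock);
                recovered[oid] = true;
                pthread_cond_broadcast(&done);
                pthread_mutex_unlock(&lock);
        }
        return NULL;
}

However many VMs pile into wait_for_object(), everything still drains
through recovery_thread() one object at a time.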
>
> For now, only master and the 0.9 series will have this feature.
>
> If we have old data on the joining node, we also need to first check
> whether the existing data is stale or not. The checking process is part of
> the recovery algorithm, and its speed can likewise only be accelerated by
> multi-threaded recovery.
>
> We statically allocate 2 threads per disk device, so make sure you have
> the sheep md (multi-disk) feature enabled and use multiple disks, so that
> there are enough threads to handle the flood of requests at the early
> stage of the recovery process and the VMs stay more responsive.
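As a back-of-the-envelope illustration of that sizing rule (a toy program,
not sheepdog code; only the "2 threads per disk" figure comes from the
paragraph above):

#include <stdio.h>

#define THREADS_PER_DISK 2   /* the static allocation mentioned above */

int main(void)
{
        /* more md disks -> proportionally more recovery threads */
        for (int nr_disks = 1; nr_disks <= 4; nr_disks++)
                printf("%d disk(s) -> %d recovery threads\n",
                       nr_disks, nr_disks * THREADS_PER_DISK);
        return 0;
}

So a node running md across 4 disks gets 8 recovery threads, while a
single-disk node is stuck with 2.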

We prioritize the VM I/O, meaning that if the target objects the VMs access
already exist on the targeted node, we make sure the VMs' requests are
executed while blocking the sheep-internal requests that try to recover
other objects.

The problem is that the target objects are unfortunately the very objects we
are trying to recover, and *there is only one thread to recover the
objects*, which blocks the VM I/O in return.
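In sketch form, the dispatch decision described above looks roughly like
this (hypothetical names again, not the real sheepdog request path):

#include <stdbool.h>
#include <stdint.h>

extern bool obj_exists(uint64_t oid);        /* already on local disk? */
extern void serve_vm_request(uint64_t oid);  /* normal I/O path */
extern void wait_for_object(uint64_t oid);   /* slow path, as in the earlier
                                                sketch: one recovery worker */

void dispatch_vm_io(uint64_t oid)
{
        if (obj_exists(oid)) {
                /* Fast path: VM I/O goes first; internal recovery of
                 * other objects is held back meanwhile. */
                serve_vm_request(oid);
                return;
        }
        /* Slow path: the object is exactly what recovery has not yet
         * rebuilt, so the request serializes behind the single recovery
         * thread and the VM stalls. */
        wait_for_object(oid);
        serve_vm_request(oid);
}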

Thanks,
Yuan
--
sheepdog-users mailing lists
sheepdog-users@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog-users