[sheepdog-users] Re: Re: Help? Creeping Errors "no inode has ..." with 0.9.1

Hitoshi Mitake mitake.hitoshi at lab.ntt.co.jp
Wed Jan 28 05:26:27 CET 2015


At Wed, 28 Jan 2015 10:23:46 +0800,
redtone wrote:
> 
> 
> Thanks for Hitoshi's confirmation.
> 
> Is the new algorithm of VID recycling compatible with v0.9.1?

Unfortunately, no. It cannot be backported to v0.9.x.

Thanks,
Hitoshi

> 
> 
> At Wed, 28 Jan 2015 08:19:37 +0800,
> redtone wrote:
> > 
> > Please make sure that VDI recycling is disabled for v0.9.1.
> > 
> >  
> > 
> > A daily snapshot job will remove the old snapshot and create a new one.
> > 
> > If VDI recycling is enabled, the assigned VDI will be deleted when the
> > snapshot is removed, and so the data is lost.
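> > 
> > For reference, the daily rotation described here is roughly the following
> > (the VDI name and snapshot tags are only placeholders):
> > 
> >     # take today's snapshot, then drop yesterday's
> >     dog vdi snapshot -s daily-2015-01-27 vm-image
> >     dog vdi delete -s daily-2015-01-26 vm-image
> > 
> > With VDI recycling enabled, deleting the old snapshot is what used to
> > trigger the data loss.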
> > 
> >  
> > 
> > This bug was fixed in v0.7.1, but I am not sure about v0.9.1.
> 
> The bug is already solved in v0.9.x, so Thornton's problem isn't
> related to it.
> 
> # BTW, the master branch has the correct algorithm for VID recycling
> 
> Thanks,
> Hitoshi
> 
> > 
> >  
> > 
> >  
> > 
> >   _____  
> > 
> > From: sheepdog-users [mailto:sheepdog-users-bounces at lists.wpkg.org]
> > On Behalf Of Thornton Prime
> > Sent: January 27, 2015 23:50
> > To: Hitoshi Mitake
> > Cc: Lista sheepdog user
> > Subject: Re: [sheepdog-users] Help? Creeping Errors "no inode has ..." with
> > 0.9.1
> > 
> >  
> > 
> > Thanks. I have been using cache -- so if that is unstable that would
> > explain a lot. I'm disabling cache to see how much that helps.
> > 
> > Attached is the "dog cluster info" output. I have a few MB of logs ... I'll
> > see where I can post them.
> > 
> > I am seeing a strong correlation between snapshots and the corrupted VDIs.
> > All the VDIs that have missing inodes are part of a daily snapshot
> > schedule.
> > All the VDIs that are not part of the snapshot schedule are fine. All the
> > nodes have object cache enabled.
> > 
> > Thanks ... I'll see if I can collect more data and reproduce the problem
> > more consistently.
> > 
> > ~ thornton prime
> > 
> > Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> > 
> > January 26, 2015 at 8:17 PM
> > 
> > At Mon, 26 Jan 2015 07:11:29 -0800,
> > Thornton Prime wrote:
> > 
> > I've been getting increasing errors in my logs that "failed No object
> > found, remote address: XXXXXXX:7000, op name: READ_PEER" and then
> > corresponding errors that "no inode has ...." when I do a cluster check.
> > 
> >  
> > Could you provide detailed logs and an output of "dog cluster info"?
> >  
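> > For example, something along these lines (the VDI name below is only a
> > placeholder):
> > 
> >     dog cluster info
> >     dog node list
> >     dog cluster check
> >     dog vdi check vm-image
> > 
> > plus the sheep.log files from the affected nodes.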
> > 
> > At the beginning of last week I had no errors, and over the course of a
> > week it grew to be one VDI missing some hundred inodes, and now it is
> > multiple VDIs each missing hundreds of objects.
> >  
> > I haven't seen any issues with the underlying hardware, disks, or
> > zookeeper on the nodes in the course of the same time.
> >  
> > What is causing this data loss? How can I debug it? How can I stem it?
> > Any chances I can repair the missing inodes?
> >  
> > I have 5 sheepdog storage nodes, also running Zookeeper. I have another
> > 8 "gateway only" nodes that are part of the node pool, but only running
> > a gateway and cache.
> > 
> >  
> > Object cache (a functionality which can be activated with the -w option of
> > sheep) is quite unstable. Please do not use it for any serious purpose.
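> > 
> > To run without it, simply start sheep with no -w option, e.g. (the store
> > path and ZooKeeper hosts here are only placeholders):
> > 
> >     sheep -c zookeeper:zk1:2181,zk2:2181,zk3:2181 /var/lib/sheepdog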
> >  
> > Thanks,
> > Hitoshi
> > 
> > 
> > 
> > Thornton Prime <thornton.prime at gmail.com>
> > 
> > January 26, 2015 at 7:11 AM
> > 
> > I've been getting increasing errors in my logs that "failed No object
> > found, remote address: XXXXXXX:7000, op name: READ_PEER" and then
> > corresponding errors that "no inode has ...." when I do a cluster check.
> > 
> > At the beginning of last week I had no errors, and over the course of a
> > week it grew to be one VDI missing some hundred inodes, and now it is
> > multiple VDIs each missing hundreds of objects.
> > 
> > I haven't seen any issues with the underlying hardware, disks, or
> > zookeeper on the nodes in the course of the same time.
> > 
> > What is causing this data loss? How can I debug it? How can I stem it?
> > Any chances I can repair the missing inodes?
> > 
> > I have 5 sheepdog storage nodes, also running Zookeeper. I have another
> > 8 "gateway only" nodes that are part of the node pool, but only running
> > a gateway and cache.
> > 
> > I have about a dozen VDI images, and they've been fairly static for the
> > last week while I've been testing -- not a lot of write activity.
> > 
> > ~ thornton
> > 
> 
> 
> 
> -- 
> sheepdog-users mailing lists
> sheepdog-users at lists.wpkg.org
> https://lists.wpkg.org/mailman/listinfo/sheepdog-users



More information about the sheepdog-users mailing list