[sheepdog] [PATCH v2 3/3] object list cache: reclaim object list cache when receiving a deletion event.

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Wed Jul 18 15:23:56 CEST 2012


At Wed, 18 Jul 2012 16:05:46 +0800,
levin li wrote:
> 
> On 07/18/2012 03:49 PM, MORITA Kazutaka wrote:
> > At Wed, 18 Jul 2012 15:25:16 +0800,
> > levin li wrote:
> >>>>> I'm still a bit confused why we need both objlist_cache_delete and
> >>>>> objlist_cache_remove.  If no recovery happens during object deletion,
> >>>>> we don't need objlist_cache_delete because objlist_cache_remove
> >>>>> removes the entry.  objlist_cache_delete is necessary for the case
> >>>>> object deletion and recovery happen at the same time to clean up
> >>>>> unhandled entries by objlist_cache_remove.  Is it correct?
> >>>>>
> >>>>
> >>>>
> >>>> Well, if no recovery happens before, objlist_cache_remove() can always
> >>>> delete the entry in object list cache while VDI deletion, but once recovery
> >>>> happens, it's something different.
> >>>>
> >>>> For example, obj A and B stay in node n1 at first, but after a recovery,
> >>>> A may be migrated from n1 to n2, and B stays in n1, but the objlist entry
> >>>
> >>> Is there any reason we don't remove the objlist entry when we unlink
> >>> the object from the farm working directory?
> >>>
> >>> diff --git a/sheep/farm/trunk.c b/sheep/farm/trunk.c
> >>> index b45427d..e390a63 100644
> >>> --- a/sheep/farm/trunk.c
> >>> +++ b/sheep/farm/trunk.c
> >>> @@ -297,6 +297,7 @@ int trunk_file_write_recovery(unsigned char *outsha1)
> >>>  				goto out;
> >>>  			}
> >>>  			dprintf("remove file %"PRIx64"\n", entry->raw.oid);
> >>> +			objlist_cache_remove(oid);
> >>>  			put_entry(entry);
> >>>  		}
> >>>  	}
> >>>
> >>> Recovery algorithm has been changed from before, so probably I'm
> >>> missing something.  Correct me if I'm wrong.
> >>>
> >>
> >> We does call objlist_cache_remove() when deleting a specified object, but
> >> objlist_cache_remove() can only delete the specified objlist entry on local
> >> node, but as I explained before the objlist may not stay on the local node
> >> because of recovery migrating the object without migrating the objlist entry.
> > 
> > After migrating the objects to other nodes, they will be removed from
> > the farm working directory.  If we remove the objlist entry when
> > removing the objects from the working directory, there is no objlist
> > entry whose oid doesn't exists on local node, isn't it.
> 
> That is the problem, we can't remove the objlist entry as soon as we remove the
> object from farm working directory, if node A end recovery before some other nodes,
> and then it delete the stale object from the farm working directory, but if it
> also delete the objlist entry, it may cause problem, try thinking of another node B
> which issues a get_obj_list() request after the objlist entry is deleted on the
> original node A, but still not added to the target node C, then node B would not find
> the objlist entry, then for node B, this object is ignored to recovery, so it's lost.

That brings me another question.  The farm driver may remove objects
from the working directory before they are migrated to other nodes.
Why is it not a problem.

> 
> 
> > 
> > Objlist cache should be a cache of oid list on the local farm working
> > directory, no?  I'm confused with what the objlist cache is caching.
> > 
> 
> It does be the oid list, thinking of the time without object list cache, no matter simple
> store or farm never delete the stale object, so no such problems before, but as we cleanup
> the working directory, the problem comes, that's another reason a objlist cache is needed.

Thank you, I understand why this patch is necessary.  Shouldn't we
document this on the wiki or in the source code?  I think the reason
why objlist cache must contain stale objects is not obvious without
explanation.

Thanks,

Kazutaka



More information about the sheepdog mailing list