[sheepdog] [PATCH v2 3/3] object list cache: reclaim object list cache when receiving a deletion event.

levin li levin108 at gmail.com
Wed Jul 18 10:05:46 CEST 2012


On 07/18/2012 03:49 PM, MORITA Kazutaka wrote:
> At Wed, 18 Jul 2012 15:25:16 +0800,
> levin li wrote:
>>>>> I'm still a bit confused why we need both objlist_cache_delete and
>>>>> objlist_cache_remove.  If no recovery happens during object deletion,
>>>>> we don't need objlist_cache_delete because objlist_cache_remove
>>>>> removes the entry.  objlist_cache_delete is necessary for the case
>>>>> object deletion and recovery happen at the same time to clean up
>>>>> unhandled entries by objlist_cache_remove.  Is it correct?
>>>>>
>>>>
>>>>
>>>> Well, if no recovery happens before, objlist_cache_remove() can always
>>>> delete the entry in object list cache while VDI deletion, but once recovery
>>>> happens, it's something different.
>>>>
>>>> For example, obj A and B stay in node n1 at first, but after a recovery,
>>>> A may be migrated from n1 to n2, and B stays in n1, but the objlist entry
>>>
>>> Is there any reason we don't remove the objlist entry when we unlink
>>> the object from the farm working directory?
>>>
>>> diff --git a/sheep/farm/trunk.c b/sheep/farm/trunk.c
>>> index b45427d..e390a63 100644
>>> --- a/sheep/farm/trunk.c
>>> +++ b/sheep/farm/trunk.c
>>> @@ -297,6 +297,7 @@ int trunk_file_write_recovery(unsigned char *outsha1)
>>>  				goto out;
>>>  			}
>>>  			dprintf("remove file %"PRIx64"\n", entry->raw.oid);
>>> +			objlist_cache_remove(oid);
>>>  			put_entry(entry);
>>>  		}
>>>  	}
>>>
>>> Recovery algorithm has been changed from before, so probably I'm
>>> missing something.  Correct me if I'm wrong.
>>>
>>
>> We does call objlist_cache_remove() when deleting a specified object, but
>> objlist_cache_remove() can only delete the specified objlist entry on local
>> node, but as I explained before the objlist may not stay on the local node
>> because of recovery migrating the object without migrating the objlist entry.
> 
> After migrating the objects to other nodes, they will be removed from
> the farm working directory.  If we remove the objlist entry when
> removing the objects from the working directory, there is no objlist
> entry whose oid doesn't exists on local node, isn't it.

That is the problem, we can't remove the objlist entry as soon as we remove the
object from farm working directory, if node A end recovery before some other nodes,
and then it delete the stale object from the farm working directory, but if it
also delete the objlist entry, it may cause problem, try thinking of another node B
which issues a get_obj_list() request after the objlist entry is deleted on the
original node A, but still not added to the target node C, then node B would not find
the objlist entry, then for node B, this object is ignored to recovery, so it's lost.


> 
> Objlist cache should be a cache of oid list on the local farm working
> directory, no?  I'm confused with what the objlist cache is caching.
> 

It does be the oid list, thinking of the time without object list cache, no matter simple
store or farm never delete the stale object, so no such problems before, but as we cleanup
the working directory, the problem comes, that's another reason a objlist cache is needed.

thanks,

levin

> Thanks,
> 
> Kazutaka
> 





More information about the sheepdog mailing list