[Sheepdog] Question about get_obj_list()
morita.kazutaka at lab.ntt.co.jp
Thu Sep 15 05:14:01 CEST 2011
At Wed, 14 Sep 2011 14:32:39 +0800,
Liu Yuan wrote:
> I am writing something that can get object distribution stats for a
> specified image, like
> dev@taobao:~/sheepdog$ collie/collie vdi object tailai.ly --stat
> node              number of objects
> 192.168.0.1:7000  96
> 192.168.0.2:7000  95
> 192.168.0.3:7000  97
Sounds nice. :)
> In the process, I found a bug in get_obj_list() that causes sheep to
> abort when handling SD_OP_GET_OBJ_LIST. I traced it and found that the
> culprit is 'buf', the buffer for the object list, which is zalloc'ed
> from sheep's heap. The problem is that the metadata glibc's malloc
> implementation keeps for 'buf' sometimes gets corrupted, and the
> following 'free(buf)' then causes
> *** glibc detected *** sheep/sheep: double free or corruption (out)
> or a similar error, and the sheep process terminates.
> From my understanding of the code, get_obj_list() returns a list of
> *targeted* objects to the requester. The patch
> [sheep: remove object list file] changed its logic a bit, and there is
> now a loop that iterates from epoch 1 to epoch n and merges all the
> objects it finds. I am not sure which line of code overruns 'buf', but
> when I remove the for loop and just return the object list from the one
> targeted epoch, I no longer see the problem.
Oops, I'll take a look at this issue.
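For reference, here is a minimal sketch of the kind of bounds check that
keeps a merged list from running past the end of its buffer. This is not
Sheepdog's code: objlist_insert() and its parameters are hypothetical
names, and the sketch assumes the list is a sorted array of 64-bit object
IDs with a fixed capacity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from Sheepdog: insert oid into the
 * sorted array list[0..*nr), skipping duplicates. Returns 0 on success
 * (or if oid is already present) and -1 if the buffer is full.
 */
static int objlist_insert(uint64_t *list, size_t *nr, size_t cap,
			  uint64_t oid)
{
	size_t lo = 0, hi = *nr;

	assert(*nr <= cap);

	/* Binary search for the first slot whose value is >= oid. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (list[mid] < oid)
			lo = mid + 1;
		else
			hi = mid;
	}

	if (lo < *nr && list[lo] == oid)
		return 0;	/* already in the list */
	if (*nr == cap)
		return -1;	/* refuse to write past the buffer */

	/* Shift the tail up by one entry and store the new ID. */
	memmove(&list[lo + 1], &list[lo], (*nr - lo) * sizeof(list[0]));
	list[lo] = oid;
	(*nr)++;
	return 0;
}
```

With a check like this, a merge that finds more objects than the buffer
can hold fails with an error instead of corrupting the allocator's
metadata next to 'buf'.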
> So my question is, what is the idea behind the for loop? The
> SD_OP_GET_OBJ_LIST request is served when the node is active (it agrees
> on the epoch that other nodes can see), so the targeted epoch exists for
> sure when serving the request. Actually, old objects in the old epochs
> need to be cleaned up, in my opinion. So why bother searching them and
> getting a list from them? I think they are simply stale hardlinks.
The loop is there to handle multiple node failures. If Sheepdog
increments the epoch number before object recovery finishes, the latest
epoch directory will not contain all the objects it should have. To
handle this problem, earlier versions of Sheepdog created an object list
file just after updating the epoch, but that approach had some problems.
I think the simplest way to create a correct object list is to search
all the objects in the Sheepdog cluster, and that is the reason for the
loop.
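
To illustrate the point, here is a small sketch of taking the union of
the object IDs found under every epoch. It is a simplified model, not
the actual get_obj_list(): merge_epochs() and oid_cmp() are hypothetical
names, and each epoch directory is represented as a plain in-memory
array of 64-bit object IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for 64-bit object IDs. */
static int oid_cmp(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical model: epochs[i] holds the nr[i] object IDs found in the
 * directory of epoch i + 1. Returns a malloc'ed, sorted, duplicate-free
 * union of all of them and stores its length in *out_nr. Returns NULL
 * if there are no objects or the allocation fails.
 */
static uint64_t *merge_epochs(const uint64_t *const *epochs,
			      const size_t *nr, size_t nr_epochs,
			      size_t *out_nr)
{
	size_t total = 0, n = 0, w = 1, i;
	uint64_t *all;

	*out_nr = 0;
	for (i = 0; i < nr_epochs; i++)
		total += nr[i];
	if (total == 0)
		return NULL;

	/* Size the buffer for the worst case: no duplicates at all. */
	all = malloc(total * sizeof(*all));
	if (!all)
		return NULL;

	for (i = 0; i < nr_epochs; i++) {
		if (nr[i] == 0)
			continue;
		memcpy(all + n, epochs[i], nr[i] * sizeof(*all));
		n += nr[i];
	}
	qsort(all, n, sizeof(*all), oid_cmp);

	/* Compact in place, keeping one copy of each object ID. */
	for (i = 1; i < n; i++)
		if (all[i] != all[w - 1])
			all[w++] = all[i];

	*out_nr = w;
	return all;
}
```

For example, if epoch 1 holds objects {1, 2, 3, 4}, epoch 2 holds {2, 4},
and epoch 3 began before recovery finished and holds only {4, 5}, then
the latest epoch alone misses objects 1, 2 and 3, while the union over
all three epochs yields the full list of five.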