[Sheepdog] [PATCH 1/2] deleting data objects of a vdi before deleting the inode

Tue May 1 13:49:57 CEST 2012

On 05/01/2012 07:14 PM, Christoph Hellwig wrote:

> On Tue, May 01, 2012 at 06:56:06PM +0800, Liu Yuan wrote:
>> I guess you might have misunderstood the patch. Deleting a VDI is very
>> hard to get right in a distributed cluster, which is subject to node
>> failures. The current async deletion makes it even harder.
>>
>> Before the patch:
>> 1) the inode is deleted before the data objects
>> 2) if a failure happens, we have no means to try the delete again, since
>> the inode is already deleted.
>>
>> This patch set aims to:
>> 1) delete the data objects before the inode object
>> 2) when a failure happens (some data object has been migrated, so an
>> error is returned), we still have the name in the vdi list output with
>> size 0, which means the last delete failed and we can try to delete it
>> again.
>> 3) if the delete succeeds, we no longer see it in the VDI list output.
> 
> I don't think it should sporadically fail.  Especially not for the
> trivial testcase I posted earlier.

Nope, it fails because of *your* last wrong patch (cleanup nr_copies).
That is why I came up with 'sheep: fix nr_copies in vdi.c', which aims
to restore the previous logic. [I haven't merged it yet; I am waiting
for your agreement.]

Please try my patch and then do the above test.
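
To make the intended ordering concrete (data objects first, inode last, so that a failed delete can be retried), here is a toy model in shell. It is only an illustration under stated assumptions, not sheepdog code: a VDI is modeled as a directory holding an `inode` file plus `obj.*` data files, and the names `delete_vdi` and `FAIL_OBJ` are made up for this sketch.

```shell
#!/bin/bash
# Toy model only, not sheepdog code: a "VDI" is a directory holding an
# 'inode' file plus 'obj.*' data object files.

# delete_vdi DIR [FAIL_OBJ]
# Remove the data objects first and the inode last. FAIL_OBJ is a
# hypothetical knob: the object with that file name "fails" to delete,
# imitating an object that migrated away during the operation.
delete_vdi() {
    local dir="$1" fail_obj="${2:-}" obj
    for obj in "$dir"/obj.*; do
        [ -e "$obj" ] || continue      # glob matched nothing
        if [ "${obj##*/}" = "$fail_obj" ]; then
            echo "failed to delete $obj" >&2
            return 1                   # inode still exists, so a retry is possible
        fi
        rm -f "$obj"
    done
    rm -f "$dir/inode"                 # reached only once every object is gone
}
```

After a failed run the directory still has its inode, which plays the role of the size-0 entry left in the vdi list output; calling `delete_vdi` on the same directory again finishes the job.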

I have just tried the following test, and it works as expected with my patch.

tailai.ly@taobao:~/sheepdog$ ./test.sh
using backend farm store
  Name        Id    Size    Used  Shared    Creation time   VDI id  Tag
  test1        1  1.0 GB  0.0 MB  0.0 MB 2012-05-01 19:42   fd32fc
  test0        1  1.0 GB  0.0 MB  0.0 MB 2012-05-01 19:42   fd34af
  Name        Id    Size    Used  Shared    Creation time   VDI id  Tag

script:
#!/bin/bash

dd if=/dev/zero of=data count=1 bs=100M
for i in 0 1 2 3; do
    sheep/sheep -a -d /home/tailai.ly/sheepdog/store/$i -z $i -p 700$i
    sleep 1
done
collie/collie cluster format -b farm
collie/collie vdi create test0 1G
collie/collie vdi create test1 1G
collie/collie vdi list
collie/collie vdi write test0 0 10M -p 7000 < data
collie/collie vdi write test1 0 20M -p 7001 < data
collie/collie vdi delete test0
collie/collie vdi delete test1
collie/collie vdi list
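
Instead of checking the final listing by eye, the script could count the rows itself. The helper below is a minimal sketch that assumes the list format shown in the output above (one header line whose first field is `Name`, then one line per VDI); `count_vdis` is a made-up name, and other collie versions may print a different layout.

```shell
# count_vdis: read 'collie vdi list' output on stdin and print the number
# of VDI rows, i.e. non-empty lines whose first field is not the header
# word "Name". Assumption: a VDI that is itself called "Name" would be
# miscounted by this simple check.
count_vdis() {
    awk 'NF > 0 && $1 != "Name" { n++ } END { print n + 0 }'
}
```

With this in place, the last line of the script could become `[ "$(collie/collie vdi list | count_vdis)" -eq 0 ] && echo "all VDIs deleted"`.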