[Sheepdog] [PATCH 2/2] make vdi setattr atomic

Fri Oct 14 13:58:19 CEST 2011

At Fri, 14 Oct 2011 11:39:47 +0100,
Chris Webb wrote:
> 
> MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> writes:
> 
> > Before pulling vdiattr branch, didn't this bug happen?
> 
> I think we never got far enough in the process to try the write because the
> setattr -x/getattr stuff failed beforehand, so I can't be sure it hasn't
> happened all along.
> 
> > If possible, can you check what is written in sheep.log when this problem
> > happens?
> 
> I've put the three log files up at
> 
>   http://cdw.me.uk/tmp/sheep-00.log
>   http://cdw.me.uk/tmp/sheep-01.log
>   http://cdw.me.uk/tmp/sheep-02.log
> 
> They're very short as I created the cluster afresh and immediately ran the
> commands that triggered the problem.

Thanks.  The cause of this problem is that you use a direct I/O option,
but the offset and length passed to "collie vdi write" are not aligned
to the sector size (512 bytes).  I didn't expect that, because a VM's
I/O requests are always sector aligned.

Is it okay to exit with an error when the offset is not aligned to
512 bytes?  And is it okay to enlarge the read/write buffer length to
the next sector-aligned size when it is not aligned?  If possible, I
don't want to treat "collie vdi read/write" as special cases.
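
To make the two questions concrete, here is a minimal sketch of the
check and the rounding.  It is not taken from the sheepdog tree: the
names SECTOR_SIZE, is_sector_aligned(), round_up_to_sector() and
prepare_direct_io() are hypothetical, and only the 512-byte sector size
comes from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sector size discussed above; the macro name is an assumption. */
#define SECTOR_SIZE 512u

/* True when value is a multiple of the sector size. */
static bool is_sector_aligned(uint64_t value)
{
	return (value & (SECTOR_SIZE - 1)) == 0;
}

/* Round a length up to the next multiple of the sector size. */
static uint64_t round_up_to_sector(uint64_t len)
{
	return (len + SECTOR_SIZE - 1) & ~(uint64_t)(SECTOR_SIZE - 1);
}

/*
 * Hypothetical helper: reject an unaligned offset, otherwise report the
 * buffer length to use.  Returns 0 on success and -1 when the offset is
 * not sector aligned.
 */
static int prepare_direct_io(uint64_t offset, uint64_t len, uint64_t *buf_len)
{
	if (!is_sector_aligned(offset))
		return -1;
	*buf_len = round_up_to_sector(len);
	return 0;
}
```

With this sketch, a request at offset 512 with length 100 would use a
512-byte buffer, while a request at offset 100 would be rejected.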

> 
> > >   0026# collie vdi list
> > >   name        id    size    used  shared    creation time   vdi id
> > >   ------------------------------------------------------------------
> > >   Floating point exception (core dumped)
> > 
> > Can you get a stack trace from the core?
> 
> This is a small collie interface bug I've seen before and meant to fix
> myself but hadn't got around to: it's a division by zero in hval_to_sheep()
> (line 205 of include/sheep.h). You do
> 
>         ret = get_nth_node(entries, nr_entries, (i + 1) % nr_entries, idx);
> 
> which is a division-by-zero if nr_entries = 0, i.e. where all the nodes have
> gone away as in this case. (There aren't any nodes to pick from in that
> case, so this should fail but not dump core!)

Thanks!  I'll fix it in the next patchset too.
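
A guard of roughly the following shape would avoid the modulo by zero.
This is only a sketch under assumptions: next_ring_index() is a
hypothetical stand-in, not the real hval_to_sheep() or get_nth_node()
code, and the -1 error return is an illustrative choice.

```c
/*
 * Hypothetical helper, not the real sheepdog code: compute the index
 * that follows i in a ring of nr_entries nodes.  Returns 0 and stores
 * the index in *next on success, or -1 when there are no nodes.
 */
static int next_ring_index(int i, int nr_entries, int *next)
{
	if (nr_entries <= 0)
		return -1; /* no nodes to pick from: fail instead of dividing by zero */
	*next = (i + 1) % nr_entries;
	return 0;
}
```

The caller can then report that no nodes are available instead of
dumping core.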

Kazutaka