[Sheepdog] [PATCH 0/2] use corosync for the cluster communication

Chris Webb chris at arachsys.com
Sat Nov 14 16:45:25 CET 2009


Chris Webb <chris at arachsys.com> writes:

> So far I've successfully created an image, booted a live CD in qemu, made a
> filesystem on my image, and I'm just about to do a bit of general stress
> testing on it.

Hi. I think I'm seeing some data corruption with block device access from
qemu. I made a sheepdog-backed ext3 filesystem inside a qemu guest booted
from a live CD, mounted it at /dst and tried unpacking a kernel source tree
onto it. Once this had finished, I ran

  find /dst -type f >/dev/null

and saw a lot of filesystem errors like

  EXT3-fs error (device hda): htree_dirblock_to_tree: bad entry in directory #148880: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0


I've repeated this a couple of times, including with an image size of only
1G, to eliminate the possibility that the bug is a 32-bit int offset
overflow with larger images. Usually the filesystem won't even mount after
the mke2fs because of a corrupt journal:

  livecd ~ # mke2fs -j /dev/hda
  mke2fs 1.40.8 (13-Mar-2008)
  /dev/hda is entire device, not just one partition!
  Proceed anyway? (y,n) y
  Warning: 256-byte inodes not usable on older systems
  Filesystem label=
  OS type: Linux
  Block size=4096 (log=2)
  Fragment size=4096 (log=2)
  65536 inodes, 262144 blocks
  13107 blocks (5.00%) reserved for the super user
  First data block=0
  Maximum filesystem blocks=268435456
  8 block groups
  32768 blocks per group, 32768 fragments per group
  8192 inodes per group
  Superblock backups stored on blocks: 
    32768, 98304, 163840, 229376

  Writing inode tables: done                            
  Creating journal (8192 blocks): done
  Writing superblocks and filesystem accounting information: done

  This filesystem will be automatically checked every 36 mounts or
  180 days, whichever comes first.  Use tune2fs -c or -i to override.

  livecd ~ # mkdir /dst
  livecd ~ # mount /dev/hda /dst
  mount: wrong fs type, bad option, bad superblock on /dev/hda,
         missing codepage or helper program, or other error
         In some cases useful info is found in syslog - try
         dmesg | tail  or so

  livecd ~ # dmesg
  [...]
  ACPI: Power Button (FF) [PWRF]
  FDC 0 is a S82078B
  warning: process `hwsetup' used the deprecated sysctl system call with 1.49.
  warning: process `hwsetup' used the deprecated sysctl system call with 1.49.
  No dock devices found.
  hda: MWDMA2 mode selected
  hdc: MWDMA2 mode selected
  eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
  JBD: no valid journal superblock found
  EXT3-fs: error loading journal.

I've checked the (small) btrfs I'm using to back sheepdog, but it isn't even
20% full yet, so that presumably isn't to blame. Doing the same mke2fs test on
a qcow2 or raw image inside the same btrfs doesn't reproduce the problem.

I've also demonstrated corruption by using dd to write a chunk of data into
the middle of a clean vdi: the md5sum of the data read back doesn't match
the md5sum of what was written.
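
A round-trip check of this kind can be sketched roughly as below. This is not
the exact command sequence used above: the function name roundtrip_check, the
target path, the offset and the length are all illustrative assumptions, and
it only assumes dd, head, md5sum and mktemp are available in the guest.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): write random data
# into the middle of a target and check that the same bytes come back.
roundtrip_check() {
  target=$1      # block device or image file, e.g. /dev/hda in the guest
  offset_mb=$2   # where to start writing, in MiB
  count_mb=$3    # how much to write, in MiB
  mib=1048576

  src=$(mktemp) || return 2

  # Generate the test data and record its checksum.
  head -c $((count_mb * mib)) /dev/urandom > "$src"
  want=$(md5sum < "$src" | cut -d' ' -f1)

  # Write it at the chosen offset without truncating the target, then flush.
  dd if="$src" of="$target" bs="$mib" seek="$offset_mb" conv=notrunc 2>/dev/null
  sync

  # Read the same region back and checksum it.
  got=$(dd if="$target" bs="$mib" skip="$offset_mb" count="$count_mb" 2>/dev/null \
        | md5sum | cut -d' ' -f1)

  rm -f "$src"
  echo "written: $want"
  echo "read:    $got"
  [ "$want" = "$got" ]
}
```

Something like "roundtrip_check /dev/hda 512 16" would then write 16M at the
512M mark and return non-zero on a mismatch. On a real block device the read
may be served from the guest's page cache, so dropping caches first (or
reading with O_DIRECT) is needed before the result says anything about the
backend.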

The corruption exists with --copies=3 and --copies=1, and with one or more
sheep/puppy pairs.

Finally, when I try

  # ./qemu-img convert -O sheepdog /tmp/livecd-amd64.iso CD

this does seem to work correctly: attaching the resulting CD image to a qemu
with

  # x86_64-softmmu/qemu-system-x86_64 -vnc :1 \
      -drive format=sheepdog,file=CD,media=cdrom

works very nicely. Perhaps only sparsely written vdis are affected?
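
One way to probe the sparse-write idea would be to scatter a few small writes
across an otherwise untouched target and verify each one. The sketch below is
only an illustration under assumptions: the name sparse_check, the 4096-byte
block size and the choice of offsets are made up for the example, and nothing
here is specific to sheepdog.

```shell
#!/bin/sh
# Hypothetical helper: write one small random block at each given offset
# (in MiB) of a target, then verify every block. Prints the offsets that
# fail to read back and returns non-zero if any do.
sparse_check() {
  target=$1; shift
  blk=4096                       # bytes per test block (assumed size)
  dir=$(mktemp -d) || return 2
  bad=0

  # Pass 1: scatter the writes, keeping a reference copy of each block.
  for mb in "$@"; do
    head -c "$blk" /dev/urandom > "$dir/$mb"
    # 1 MiB is 256 blocks of 4096 bytes, so seek is mb * 256.
    dd if="$dir/$mb" of="$target" bs="$blk" seek=$((mb * 256)) \
       conv=notrunc 2>/dev/null
  done
  sync

  # Pass 2: read each block back and compare with its reference copy.
  for mb in "$@"; do
    if ! dd if="$target" bs="$blk" skip=$((mb * 256)) count=1 2>/dev/null \
         | cmp -s - "$dir/$mb"; then
      echo "mismatch at ${mb} MiB"
      bad=1
    fi
  done

  rm -rf "$dir"
  return "$bad"
}
```

Running it as "sparse_check /dev/hda 1 100 700" on a fresh vdi, and again
after filling the whole device once, would show whether the two cases differ.
The same page cache caveat applies as for any read-back test inside the guest.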

Cheers,

Chris.


