On Wed, 2013-04-10 at 08:17 -0400, Anthony Sheetz wrote:
> Steps to reproduce:
> Install Debian Testing from Netinstall CD, amd64.
> Choose LVM and Full Disk Encryption, with a separate /home
> Resize /home to be 80GB
> Install openswan, connect to remote network
> Install xen
> Set up a virtual machine with Debian Stable using logical volumes as the
> backing store.
> fs: ext3
> network: NAT
> transfer a large (multigigabyte) file from a remote server over the internet
> to the virtual machine
>
> Expected behavior: File transfers fine, md5sum agrees with remote system
> Observed behavior: md5sum never matches, done enough times, the ext3 fs
> becomes corrupted

Can I just confirm a few things please:

The VM disk backend is an LVM volume which is included in the full disk
encryption? I suppose it is using dm-crypt?

The ext3 fs which becomes corrupted is the guest VM filesystem, not the
dom0 filesystem, nor a filesystem image which the large multigigabyte
file transferred over the network consists of?

On the face of it, it sounds to me like the network corruption (md5sum
issue) and the eventual ext3 corruption must be separate issues. Or I
suppose it is possible that the file is received correctly but is
corrupted when written to the disk, but it's probably better to consider
them separately until we know one way or the other.

WRT the file transfer corruption: Is the file being transferred over the
openswan link? Did you ever happen to try a transfer over a
non-tunnelled connection? Were you able to successfully transfer the
file to the dom0 filesystem or to any other system (e.g. one not running
Xen) on this end of the openswan link?

I'm not sure what error detection/correction scp/rsync do, or if they
have any additional verification options which could be tried. Perhaps
it is possible to run md5sum on the stream before it hits the disk (can
one rsync/scp to stdout? I doubt it).

If you can transfer to dom0 OK then it might be interesting to try
turning off the various offloads (GSO, SG etc) on the vif link.

WRT the filesystem corruption: How did the ext3 corruption manifest
itself?

I wonder if the layering of crypto+lvm+xen-blkback is causing the
barriers which ext3 requires to function correctly to not occur in the
right places. Does something need to be manually configured to enable
barriers at some layer? (Or perhaps I am thinking of DISCARD support.)
If you were able to attempt to reproduce without the crypto bit in dom0
for the VM disk that would be really useful.
It might also be interesting to try using the ext3 barrier mount option
in the guest to switch barriers either off or on (I can't remember what
the default was for Squeeze).

I appreciate that you may have redeployed/downgraded the systems, so
some of the above experiments might be quite hard to try out, but if you
could set up a spare system or something it would be very much
appreciated.

Ian.
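
A rough sketch of the "md5sum on the stream before it hits the disk"
idea, in case it helps separate the two failure modes. It hashes the
bytes as they arrive and again after reading them back from disk. The
function names (copy_with_md5, md5_of_file) are hypothetical, and it
assumes the file can be delivered as a plain byte stream, for example by
piping the output of "ssh remote cat file" into a script that calls
these with sys.stdin.buffer.

```python
import hashlib

CHUNK = 1 << 20  # read 1 MiB at a time (arbitrary choice)


def copy_with_md5(src, dst, chunk_size=CHUNK):
    """Copy the binary stream src to dst.

    Returns the MD5 hex digest of the bytes as received, i.e. before
    they have passed through the filesystem or block layer.
    """
    digest = hashlib.md5()
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        digest.update(block)
        dst.write(block)
    return digest.hexdigest()


def md5_of_file(path, chunk_size=CHUNK):
    """Return the MD5 hex digest of the file at path, read back from disk."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

If the digest from copy_with_md5 already differs from the md5sum on the
remote system, the data was damaged before reaching the disk. If it
matches but md5_of_file differs, the storage path is the suspect. Note
that a read-back straight after the write may be served from the page
cache, so it is probably worth dropping caches (or rebooting the guest)
before running the second check.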