On Mon, Apr 16, 2012 at 11:37 AM, Prateek Sharma wrote:
> On Mon, Apr 16, 2012 at 3:35 PM, Stefan Hajnoczi wrote:
>> On Sat, Apr 14, 2012 at 2:31 PM, Prateek Sharma
>> wrote:
>>> I am writing to ask whether there is any possibility of silent data
>>> corruption with virtIO/qemu.
>>> My files inside the guest are getting zeroed out. Specifically, bytes
>>> 512 to 4096 on some files are being zeroed out (some 60K files out of
>>> 5 million are showing this so far). The virtual disk is a file on
>>> ext4 (size of file is 1TB), and I am using virtio, aio-threads=native,
>>> and cache=none. The disk holding the virtual disk is on mdadm
>>> raid1, so I am guessing the possibility of disk corruption is low.
>>
>> Did you check the corruption by mounting the image file on the host
>> while the VM is not running? This way you can be sure that there is no
>> bug that causes the guest to "see" zeroes. If you see the zeroes
>> inside the guest we can't be 100% sure that the image file itself
>> contains them.
>>
>> Can you describe the guest configuration? Guest operating system?
>> Application/mail server? Which files are being corrupted? Is I/O
>> constantly being issued by the guest or does the corruption appear
>> even when the files are not being accessed? Basically anything that
>> can help spot a pattern here.
>>
>> Stefan
>
> Hi Stefan,
>
> *The problem seems to have gone away after having upgraded the host
> and guest kernel (3.2.2) and qemu to 1.0*
>
> The files were being corrupted on disk as well (checked by mounting
> the disk image via qemu-nbd). The guest is ubuntu-10.04.3 x86 server
> (2.6.32 pae kernel). It's a mail server, with Postfix and
> Dovecot (1.2.6). The files being corrupted are maildir files.
> Accessing the files seems to increase the probability of corruption,
> although I cannot prove a definite correlation. I could not spot a
> pattern in the file corruption, other than the weird 512-4096 bytes
> being zeroed out for the files. The block size everywhere is 4k, so I
> wonder how this could possibly happen.
>
> I am guessing it was probably some kernel bug that caused this in the
> first place?

Maybe. I'm afraid there's not enough information to tell what is going
on. If you do see this again we can try to debug it.

Stefan
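
For anyone who wants to check a mail store for the pattern described in this thread (bytes 512 to 4096 of a file reading as zeroes), here is a minimal Python sketch. It is not something either poster used. The function names are hypothetical, the range is assumed to be the half-open interval [512, 4096), and files too short to cover the whole range are skipped, which may or may not match how the original corruption showed up.

```python
import os

# Assumed byte range, read as [512, 4096); the thread does not say
# whether the upper bound is inclusive.
ZERO_START = 512
ZERO_END = 4096


def has_zeroed_range(path, start=ZERO_START, end=ZERO_END):
    """Return True if the file covers [start, end) and every byte there is zero."""
    with open(path, "rb") as f:
        f.seek(start)
        chunk = f.read(end - start)
    # Assumption: only files long enough to contain the whole range are flagged.
    return len(chunk) == end - start and chunk.count(0) == len(chunk)


def scan_tree(root):
    """Yield paths under root whose bytes in [512, 4096) are all zero."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if has_zeroed_range(path):
                    yield path
            except OSError:
                # Unreadable entries (permissions, races with delivery) are skipped.
                continue
```

Running `scan_tree()` over the same maildir once inside the guest and once on the image mounted on the host with the VM stopped would separate "the guest sees zeroes" from "the image contains zeroes", which is the distinction Stefan asks about.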
