[Qemu-devel] e1000 memory corruption in guest OS

Hoyer, David Sat, 15 Feb 2014 21:22:28 -0800

We are using QEMU 1.7.0 with Xen 4.3.0 and Debian jessie. When we transfer 
large files from our network to the guest OS through the e1000 virtual network 
device, we see memory corruption in the guest OS. We have debugged the problem, 
determined where the corruption appears to happen, and created a patch that 
fixes it (at least, the corruption no longer occurs in our guest OS). Note that 
our test file is a large file consisting of the byte 0x61 repeated throughout.


To troubleshoot this issue, we enabled tracing in QEMU and used the 
xen_map_cache and xen_map_cache_return trace events. We also added our own 
debug statements in e1000.c before and after the call that DMAs the network 
packet to the guest OS descriptor address. Below is a commented summary of the 
trace output:

/*** Check if guestOS address 0xe00000 (which maps to 0x7f15c313f000) is 
corrupted
xen_map_cache want 0xe00000
xen_map_cache_return 0x7f15c313f000
/*** It wasn't corrupted before the dma write
/*** DMA a packet of length 0x5aa containing '0x61616161...' to guestOS at 
address 0x12ffac2 (which maps to 0x7f15c313eac2)
dma write to 12ffac2 len 5aa
xen_map_cache want 0x12ffac2
xen_map_cache_return 0x7f15c313eac2
/*** Check if guestOS address 0xe00000 (which maps to 0x7f15c313f000) is 
corrupted
xen_map_cache want 0xe00000
xen_map_cache_return 0x7f15c313f000
/*** It is corrupted now.
e1000: Corrupted 7: test_buf:5aa 5aa

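The DMA address 0x12ffac2 mapped to host address 0x7f15c313eac2. Adding the 
packet length, 0x5aa, gives an end address of 0x7f15c313f06c. That is 0x6c 
bytes past 0x7f15c313f000, where guest OS address 0xe00000 is mapped. The two 
guest addresses are not adjacent, but their host mappings are, so the write 
runs off the end of its own mapping and into the next one. If you dump 
0xe00000 in the guest OS, the first 0x6c bytes are corrupted.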
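
The arithmetic above can be checked with a minimal C sketch. The helper name 
overrun_bytes is hypothetical and is not part of QEMU; it only computes how far 
a write of a given length extends past a given host address.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how many bytes a write of `len` bytes starting at host address
 * `start` extends past `limit`, or 0 if it ends at or before `limit`.
 * Hypothetical helper for illustration only; not a QEMU function.
 */
static uint64_t overrun_bytes(uint64_t start, uint64_t len, uint64_t limit)
{
    assert(len <= UINT64_MAX - start);   /* start + len must not wrap */
    uint64_t end = start + len;
    return end > limit ? end - limit : 0;
}
```

With the values from the trace, overrun_bytes(0x7f15c313eac2, 0x5aa, 
0x7f15c313f000) evaluates to 0x6c, which matches the number of corrupted bytes 
seen at guest OS address 0xe00000.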

We believe the correct fix is to use qemu_ram_ptr_length instead of 
qemu_get_ram_ptr in address_space_rw, so that (from what we can tell) the 
returned pointer is valid for the entire length that is then copied. It looked 
like cpu_physical_memory_write_rom might have the same issue, so we made the 
change there as well.
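
To illustrate why clamping the length matters, here is a toy model in C. It is 
a sketch under stated assumptions, not QEMU code: every name (toy_ram, 
toy_slot_of, toy_ptr_length, toy_write, toy_read_byte) and the tiny 16-byte 
bucket size are hypothetical. Guest memory is split into buckets that are 
mapped to host memory in a different order, so a pointer obtained for one 
bucket is only valid up to the end of that bucket. The lookup reduces the 
caller's length to what is left in the bucket, and the caller loops until the 
whole buffer is copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUCKET_SIZE 16u   /* hypothetical; tiny so the effect is visible */
#define TOY_NUM_BUCKETS 4u

/* Guest bucket i lives in host slot toy_slot_of[i]: adjacent guest buckets
 * are not adjacent in host memory. */
static const size_t toy_slot_of[TOY_NUM_BUCKETS] = { 2, 0, 3, 1 };

struct toy_ram {
    uint8_t host[TOY_NUM_BUCKETS * TOY_BUCKET_SIZE];
};

/* Translate a guest address to a host pointer and clamp *len to the number
 * of bytes remaining in that bucket. */
static uint8_t *toy_ptr_length(struct toy_ram *ram, size_t guest_addr,
                               size_t *len)
{
    size_t bucket = guest_addr / TOY_BUCKET_SIZE;
    size_t offset = guest_addr % TOY_BUCKET_SIZE;
    size_t left = TOY_BUCKET_SIZE - offset;

    assert(bucket < TOY_NUM_BUCKETS);
    if (*len > left) {
        *len = left;
    }
    return &ram->host[toy_slot_of[bucket] * TOY_BUCKET_SIZE + offset];
}

/* Copy buf into guest memory one clamped chunk at a time. */
static void toy_write(struct toy_ram *ram, size_t guest_addr,
                      const uint8_t *buf, size_t len)
{
    while (len > 0) {
        size_t l = len;
        uint8_t *ptr = toy_ptr_length(ram, guest_addr, &l);

        memcpy(ptr, buf, l);   /* never crosses a bucket boundary */
        guest_addr += l;
        buf += l;
        len -= l;
    }
}

/* Read one byte through the same translation. */
static uint8_t toy_read_byte(struct toy_ram *ram, size_t guest_addr)
{
    size_t l = 1;
    return *toy_ptr_length(ram, guest_addr, &l);
}
```

In this model, an 8-byte write at guest address 12 is split into 4 bytes at the 
end of bucket 0 and 4 bytes at the start of bucket 1. A single unclamped memcpy 
from the first pointer would instead spill into the host slot that backs guest 
bucket 2, which is the same kind of corruption described above.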

We are fairly new to the qemu source base so we are looking to the community to 
see if this problem has previously been identified and to see if this is the 
correct fix.

The patch follows:

--- orig/exec.c 2013-11-27 16:52:55.000000000 -0600
+++ new/exec.c  2014-02-15 21:58:34.311518000 -0600
@@ -1911,7 +1911,7 @@
             } else {
                 addr1 += memory_region_get_ram_addr(mr);
                 /* RAM case */
-                ptr = qemu_get_ram_ptr(addr1);
+                ptr = qemu_ram_ptr_length(addr1, &l);
                 memcpy(ptr, buf, l);
                 invalidate_and_set_dirty(addr1, l);
             }
@@ -1945,7 +1945,7 @@
                 }
             } else {
                 /* RAM case */
-                ptr = qemu_get_ram_ptr(mr->ram_addr + addr1);
+                ptr = qemu_ram_ptr_length(mr->ram_addr + addr1, &l);
                 memcpy(buf, ptr, l);
             }
         }
@@ -1995,7 +1995,7 @@
         } else {
             addr1 += memory_region_get_ram_addr(mr);
             /* ROM/RAM case */
-            ptr = qemu_get_ram_ptr(addr1);
+            ptr = qemu_ram_ptr_length(addr1, &l);
             memcpy(ptr, buf, l);
             invalidate_and_set_dirty(addr1, l);
         }


David Hoyer
Controller Firmware Development
Array Products Group

NetApp
3718 N. Rock Road
Wichita, KS 67226
316-636-8047 phone
316-617-3677 mobile
