zero_bss() incorrectly assumed that any PT_LOAD containing .bss must be
writable, rejecting valid ELF binaries where .bss overlaps the tail of
an RX file-backed page.

Instead of failing, temporarily enable write access on the overlapping
page to zero the fractional bss range, then restore the original page
permissions once initialization is complete.
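
For illustration only, the enable/zero/restore sequence can be sketched
as a host-side POSIX program. This is not the QEMU code: it uses plain
mprotect() instead of target_mprotect(), the names zero_page_tail() and
demo_read_only_page() are hypothetical, and the demo page is read-only
rather than RX to keep the sketch portable.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: zero the bytes from `start` up to the end of the
 * page containing it. `prot` is the protection the page currently has.
 * If the page is not writable, write access is enabled only around the
 * memset() and the original protection is restored afterwards.
 * Returns 0 on success, -1 if mprotect() fails (errno is left set).
 */
static int zero_page_tail(unsigned char *start, int prot)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    uintptr_t addr = (uintptr_t)start;
    uintptr_t page_start = addr & ~(uintptr_t)(page_size - 1);
    uintptr_t page_end = page_start + page_size;
    int need_write = !(prot & PROT_WRITE);

    if (addr == page_start) {
        return 0; /* page-aligned start: there is no fractional part */
    }
    if (need_write &&
        mprotect((void *)page_start, page_size, prot | PROT_WRITE) == -1) {
        return -1;
    }
    memset(start, 0, page_end - addr);
    if (need_write &&
        mprotect((void *)page_start, page_size, prot) == -1) {
        return -1;
    }
    return 0;
}

/*
 * Usage sketch: fill a page with 0xAA, make it read-only, zero its tail
 * from offset 100, and check that the head is untouched.
 * Returns 0 if everything behaved as expected, -1 otherwise.
 */
static int demo_read_only_page(void)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int ok;

    if (page == MAP_FAILED) {
        return -1;
    }
    memset(page, 0xAA, page_size);
    if (mprotect(page, page_size, PROT_READ) == -1) {
        munmap(page, page_size);
        return -1;
    }
    ok = zero_page_tail(page + 100, PROT_READ) == 0 &&
         page[99] == 0xAA && page[100] == 0 && page[page_size - 1] == 0;
    munmap(page, page_size);
    return ok ? 0 : -1;
}
```

The sketch passes the original protection in explicitly so that the
restore step reapplies exactly what the caller had, which mirrors how
the patch reuses prot for the second target_mprotect() call.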

To validate the modified zero_bss(), two targeted test binaries were
built to exercise the edge case where .bss overlaps a partially filled
page belonging to an R_X region. The binaries have no main() function
and instead define a custom ELF entry point via the _start symbol.
This bypasses the CRT, the dynamic loader, and libc initialization, so
execution begins immediately after QEMU completes ELF loading and
memory mapping.

The first binary defines a minimal _start routine that terminates
immediately via a system call, without ever referencing the .bss
symbol. Although a .bss section is present in the ELF, it is not
accessed at runtime, and the resulting PT_LOAD mapping can be
established without triggering any writes to a file-backed RX page.
In this configuration, QEMU loads the binary successfully and the
loader reaches the zero_bss() path, validating that the fractional
.bss region is zeroed without violating the original segment
permissions.

The second binary explicitly reads the global .bss symbol (x) at
program entry. This forces the linker to place the .bss region within
the same PT_LOAD segment as the RX code, yielding a segment with
p_filesz < p_memsz and flags R|X. In this case, QEMU fails during the
initial file-backed mmap() of the PT_LOAD segment, returning EINVAL.
This is consistent with the Linux kernel's ELF loader semantics, which
prohibit mapping a file-backed segment as RX when it (logically)
contains writable memory. The failure therefore occurs before
zero_bss() is reached, which is the expected and correct behavior.
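
As a rough illustration of the segment shape described above, the bss
range of a PT_LOAD can be derived from its program header. The helper
names and the example values in the usage note are hypothetical; only
the Elf64_Phdr fields and the PT_LOAD/PF_* constants come from <elf.h>.

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>

/* A PT_LOAD carries zero-fill (bss) memory when p_filesz < p_memsz. */
static bool phdr_has_bss(const Elf64_Phdr *ph)
{
    return ph->p_type == PT_LOAD && ph->p_filesz < ph->p_memsz;
}

/* The bss starts where the file-backed bytes of the segment end. */
static uint64_t phdr_bss_start(const Elf64_Phdr *ph)
{
    return ph->p_vaddr + ph->p_filesz;
}

/* The bss ends at the end of the segment's in-memory image. */
static uint64_t phdr_bss_end(const Elf64_Phdr *ph)
{
    return ph->p_vaddr + ph->p_memsz;
}

/* The shape of the second test binary: executable, not writable, with bss. */
static bool phdr_is_rx_with_bss(const Elf64_Phdr *ph)
{
    return phdr_has_bss(ph) &&
           (ph->p_flags & PF_X) && !(ph->p_flags & PF_W);
}
```

For example, a header with p_vaddr = 0x401000, p_filesz = 0x234,
p_memsz = 0x238 and flags PF_R | PF_X would have a 4-byte bss at
[0x401234, 0x401238) that shares its page with the code.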

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3179
Signed-off-by: Razvan Ghiorghe <[email protected]>
---
 linux-user/elfload.c | 38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 35471c0c9a..fa3f7cda69 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -449,18 +449,11 @@ static bool zero_bss(abi_ulong start_bss, abi_ulong end_bss,
 {
     abi_ulong align_bss;
 
-    /* We only expect writable bss; the code segment shouldn't need this. */
-    if (!(prot & PROT_WRITE)) {
-        error_setg(errp, "PT_LOAD with non-writable bss");
-        return false;
-    }
-
     align_bss = TARGET_PAGE_ALIGN(start_bss);
     end_bss = TARGET_PAGE_ALIGN(end_bss);
 
     if (start_bss < align_bss) {
         int flags = page_get_flags(start_bss);
-
         if (!(flags & PAGE_RWX)) {
             /*
              * The whole address space of the executable was reserved
@@ -472,20 +465,35 @@ static bool zero_bss(abi_ulong start_bss, abi_ulong end_bss,
              */
             align_bss -= TARGET_PAGE_SIZE;
         } else {
+            abi_ulong start_page_aligned = start_bss & TARGET_PAGE_MASK;
             /*
-             * The start of the bss shares a page with something.
-             * The only thing that we expect is the data section,
-             * which would already be marked writable.
-             * Overlapping the RX code segment seems malformed.
+             * The bitwise OR of prot and PAGE_WRITE works because the
+             * PAGE_* flags in include/exec/page-protection.h are defined
+             * as PROT_* values, matching mprotect().
+             * Temporarily enable write access to zero the fractional bss.
+             * target_mprotect() handles TB invalidation if needed.
              */
             if (!(flags & PAGE_WRITE)) {
-                error_setg(errp, "PT_LOAD with bss overlapping "
-                           "non-writable page");
-                return false;
+                if (target_mprotect(start_page_aligned,
+                                    TARGET_PAGE_SIZE,
+                                    prot | PAGE_WRITE) == -1) {
+                    error_setg_errno(errp, errno,
+                                    "Error enabling write access for bss");
+                    return false;
+                }
             }
 
-            /* The page is already mapped and writable. */
+            /* The page is already mapped and now guaranteed writable. */
             memset(g2h_untagged(start_bss), 0, align_bss - start_bss);
+
+            if (!(flags & PAGE_WRITE)) {
+                if (target_mprotect(start_page_aligned,
+                                    TARGET_PAGE_SIZE, prot) == -1) {
+                    error_setg_errno(errp, errno,
+                                    "Error restoring bss page permissions");
+                    return false;
+                }
+            }
         }
     }
 
-- 
2.43.0

