Michael Tokarev wrote on Tue, Nov 19, 2024 at 01:32:33PM +0300:
> Thank you for such an accurate, detailed, smart and thoughtful bug report!
> It is a real pleasure to read such bug reports.

Glad it was appreciated! :)

> And this is mmap.
>
> The underlying problem here is that qemu-user don't have a MMU emulation,
> it can't arbitrary remap addresses between host and guest. So it have to
> choose a virtual address space for the guest at startup, roughly speaking,
> a single region of it. And if this address space happens to contain this
> process data structures on the host, things goes badly.
>
> qemu had numerous tweaks in this area, every new release brings a new
> portion of changes, but the very root cause of all this mess needs
> significant redesign of the whole thing which aint going to happen any
> time soon.

Thanks for the explanation, this all makes sense given that disabling
kaslr works around the issue.

>> - after inspecting kernel patches between 6.1.99 and 6.1.112 the only
>> patch that stood out was a kaslr change[1], so I tried to disable it:
>> `sysctl kernel.randomize_va_space=0`
>> This also worked around the problem. Note I didn't confirm reverting
>> the patch also fixes the issue, this is just a guess at this point.
>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d3dc52ff36a333c11b831809fcade780fd292c1

Curiosity got me along with the cat, so I went and bisected it.
That apparently was a bad guess, the culprit is this one:
https://git.kernel.org/linus/b0cde867b80a5e81fcbc0383e138f5845f2005ee

  x86: Increase brk randomness entropy for 64-bit systems

  In commit c1d171a00294 ("x86: randomize brk"), arch_randomize_brk() was
  defined to use a 32MB range (13 bits of entropy), but was never increased
  when moving to 64-bit. The default arch_randomize_brk() uses 32MB for
  32-bit tasks, and 1GB (18 bits of entropy) for 64-bit tasks.

  Update x86_64 to match the entropy used by arm64 and other 64-bit
  architectures.

I've sent a mail upstream to suggest reverting it for stable branches:
https://lkml.kernel.org/r/zz0_-ijh1war3...@codewreck.org

Either way, it's related to kaslr as we had guessed.

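As a sanity check on the entropy figures in that commit message, here is a
minimal sketch of the arithmetic. It assumes the brk offset is picked at page
granularity with 4 KiB pages (my assumption, the commit message does not spell
it out), and `entropy_bits` is just a helper name made up for illustration:

```shell
#!/bin/sh
# entropy_bits RANGE_BYTES [PAGE_SIZE]
# Number of random bits when an offset is chosen page-aligned inside a
# range: log2(range / page size). Assumes both are powers of two and
# defaults to 4 KiB pages (an assumption for typical x86_64 systems).
entropy_bits() {
    pages=$(( $1 / ${2:-4096} ))
    bits=0
    while [ "$pages" -gt 1 ]; do
        pages=$((pages / 2))
        bits=$((bits + 1))
    done
    echo "$bits"
}

entropy_bits $((32 * 1024 * 1024))     # 32MB range, prints 13
entropy_bits $((1024 * 1024 * 1024))   # 1GB range, prints 18
```

So the patch takes the possible brk start positions from 2^13 to 2^18, which
would fit with a collision that used to be very rare becoming visible.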
> > - rather unfortunately, checking out qemu from git, applying the
> > patches from salsa's debian-bookworm branch and building manually
> > as follow also made the problem go away:
> > ```
> > mkdir build && cd build
> > ../configure --without-default-features --enable-linux-user \
> >     --target-list=aarch64-linux-user --static
> > ninja qemu-aarch64
> > mv /usr/bin/qemu-aarch64-static /usr/bin/qemu-aarch64-static.delete
> > cp qemu-aarch64 /usr/bin/qemu-aarch64-static
> > systemctl restart binfmt-support.service
> > ```
>
> What a nice and easy trick you do here. Yes, this is the way to go,
> it's just rare to see users are able to figure this out :)

[semi-offtopic]
It should get "easier" (as in, not require installing) with binfmt
namespaces and unshare --load-interp to get a namespace where a different
interpreter is used, but unfortunately that only landed in 6.7 so it is not
available in bookworm yet, and I didn't get to test it all the way
[/offtopic]

> But ok. I'll take a closer look at this one. Because debian-bookworm
> branch is the one which is used for bookworm qemu build, and it should
> produce the same binary (give or take some unimportant details) as in
> the debian archive, but apparently it is not producing the same thing.

I've rebuilt the same tree with dpkg-buildpackage and can definitely
reproduce with that package.
Looking further into the configure arguments, I've trimmed down the
difference to --disable-pie: just adding that flag makes my hand build
reproduce this.

afaiu having pie allows the kernel to map qemu to a different place, so
I'm not sure if this really fixes the problem or if it just makes it
incredibly unlikely to reproduce.

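For anyone wanting to check which variant a given build produced, the ELF
header type should tell: a rough sketch, assuming binutils' readelf is
installed. The helper names `elf_type_from_header` and `elf_kind` are made up
for this example:

```shell
#!/bin/sh
# Read `readelf -h` output on stdin and print the ELF type field:
# "DYN" for position-independent executables (and shared objects),
# "EXEC" for executables linked at a fixed address.
elf_type_from_header() {
    awk '$1 == "Type:" { print $2; exit }'
}

# Hypothetical wrapper: print the ELF type of one binary.
elf_kind() {
    readelf -h "$1" | elf_type_from_header
}

# Only runs when readelf and the binary are both present.
bin=/usr/bin/qemu-aarch64-static
if command -v readelf >/dev/null 2>&1 && [ -e "$bin" ]; then
    echo "$bin: $(elf_kind "$bin")"
fi
```

I would expect EXEC for a build configured with --disable-pie and DYN
otherwise, but I have not double-checked this on the packaged binary.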
I've started a loop to let my reproducer run overnight to see if I can
reproduce with pie enabled, and will report back if it does reproduce
(if you don't hear back from me then it's probably a good enough
workaround -- unsure why pie was disabled in the first place, the
qemu-system part of the dpkg build does leave pie on so there must have
been a reason?)

> Yes, that's what I'd love to see more closely. Please give me a few
> days now once you've a working solutions (many of them actually), -
> I'm on a business trip now, will have a closer look when I'll return.

No hurry here, I've just been feeding my curiosity.
The two open questions I had left yesterday are now solved so I'll leave
the resolution to you on your own time.

Cheers,
-- 
Dominique