On 11/27/25 12:02 PM, Daniel P. Berrangé wrote:
> On Thu, Nov 27, 2025 at 10:56:12AM +0100, Kevin Wolf wrote:
>> On 25.11.2025 at 15:21, [email protected] wrote:
>>> From: Andrey Drobyshev <[email protected]>
>>>
>>> Commit 772f86839f ("scripts/qemu-gdb: Support coroutine dumps in
>>> coredumps") introduced coroutine traces in coredumps using raw stack
>>> unwinding. While this works, this approach does not allow to view the
>>> function arguments in the corresponding stack frames.
>>>
>>> As an alternative, we can obtain saved registers from the coroutine's
>>> jmpbuf, copy the original coredump file into a temporary file, patch the
>>> saved registers into the tmp coredump's struct elf_prstatus and execute
>>> another gdb subprocess to get backtrace from the patched temporary coredump.
>>>
>>> While providing more detailed info, this alternative approach, however, is
>>> quite heavyweight as it takes significantly more time and disk space.
>>> So, instead of making it a new default, let's keep raw unwind the default
>>> behaviour, but add the '--detailed' option for 'qemu bt' and 'qemu
>>> coroutine'
>>> command which would enforce the new behaviour.
>>> [...]
>>
>>> +def clone_coredump(source, target, set_regs):
>>> +    shutil.copyfile(source, target)
>>> +    write_regs_to_coredump(target, set_regs)
>>> +
>>> +def dump_backtrace_patched(regs):
>>> +    files = gdb.execute('info files', False, True).split('\n')
>>> +    executable = re.match('^Symbols from "(.*)".$', files[0]).group(1)
>>> +    dump = re.search("`(.*)'", files[2]).group(1)
>>> +
>>> +    with tempfile.NamedTemporaryFile(dir='/tmp', delete=False) as f:
>>> +        tmpcore = f.name
>>> +
>>> +    clone_coredump(dump, tmpcore, regs)
>>
>> I think this is what makes it so heavy, right? Coredumps can be quite
>> large and /tmp is probably a different filesystem, so you end up really
>> copying the full size of the coredump around.
>
> On my system /tmp is tmpfs, so this is actually bringing the whole
> coredump into RAM which is not a sensible approach.
>
>> Wouldn't it be better in the general case if we could just do a reflink
>> copy of the coredump and then do only very few writes for updating the
>> register values? Then the overhead should actually be quite negligible
>> both in terms of time and disk space.
>
That's correct, copying the file to /tmp takes most of the time with
this approach.

As for a reflink copy, this could have been a great solution. However, it
would largely depend on the FS used. E.g. on my system coredumpctl places
the uncompressed coredump in /var/tmp, which is mounted as ext4. And in
this case:

# cp --reflink /var/tmp/coredump-MQCZQc /root
cp: failed to clone '/root/coredump-MQCZQc' from '/var/tmp/coredump-MQCZQc': Invalid cross-device link
# cp --reflink /var/tmp/coredump-MQCZQc /var/tmp/coredump.ref
cp: failed to clone '/var/tmp/coredump.ref' from '/var/tmp/coredump-MQCZQc': Operation not supported

Apparently, ext4 doesn't support reflink copy. xfs and btrfs do. But I
guess our implementation had better be FS-agnostic.

> Personally I'd be fine with just modifying the core dump in place
> most of the time. I don't need to keep the current file untouched,
> as it is just a temporary download acquired from systemd's
> coredumpctl, or from a bug tracker.
>
>

Hmm, that's an interesting proposal. But I still see some potential
pitfalls with it:

1. When dealing with a core dump stored by coredumpctl, the original file
   is indeed stored compressed and is not modified. We don't really care
   about the uncompressed temporary dump placed in /var/tmp. What we do
   care about is that the current GDB session keeps working smoothly. I
   tried patching the dump in place without copying, and it doesn't seem
   to break subsequent commands. However, GDB keeps the temporary dump
   open throughout the whole session, which means it can occasionally
   read modified data from it. I'm not sure we have a solid guarantee
   that things will keep working with the patched dump.

2. If we're dealing with an external core dump downloaded from a bug
   report, we surely want to be able to create new GDB sessions with it.
   That means we'll want its unmodified version. Having to download it
   again is even slower than plain copying.

The solution to both problems would be saving the original registers and
patching them back into the core dump once we've obtained our coroutine
trace. It's still potentially fragile in the 2nd case if the GDB process
abruptly gets killed/dies, leaving the registers un-restored. But I guess
we can live with it?

What do you think?

> With regards,
> Daniel
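
To make the save-and-restore idea above more concrete, here is a minimal
sketch. It is not taken from the patch: the helper name patched_bytes and
its (path, offset, new_bytes) interface are hypothetical, and it assumes
the file offset of the saved registers inside struct elf_prstatus has
already been located by whatever logic write_regs_to_coredump uses.

```python
import contextlib


@contextlib.contextmanager
def patched_bytes(path, offset, new_bytes):
    """Temporarily overwrite bytes in a file and restore them on exit.

    Hypothetical helper: 'offset' is assumed to point at the register
    block inside the core dump, which the caller must locate itself.
    """
    with open(path, 'r+b') as f:
        # Save the original bytes before touching anything.
        f.seek(offset)
        saved = f.read(len(new_bytes))
        if len(saved) != len(new_bytes):
            raise ValueError('patch range extends past end of file')

        f.seek(offset)
        f.write(new_bytes)
        f.flush()
        try:
            yield saved
        finally:
            # Runs on normal exit and on Python exceptions, but not if
            # the process is killed outright (e.g. SIGKILL).
            f.seek(offset)
            f.write(saved)
            f.flush()
```

The gdb subprocess that prints the backtrace would run inside the "with"
block. As noted above, this does not cover the case where the GDB process
dies abruptly: the "finally" clause never runs then, and the dump stays
patched.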

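For completeness, an FS-agnostic variant of the reflink idea could try a
clone first and fall back to a full copy where cloning is not supported
(ext4, cross-device, tmpfs). This is only a sketch under assumptions: the
function name copy_reflink_or_full is hypothetical, it is Linux-only, and
the hard-coded FICLONE value is the one used on x86 and ARM (Python 3.12+
exposes it as fcntl.FICLONE; other architectures may encode it
differently).

```python
import fcntl
import shutil

# FICLONE ioctl number from <linux/fs.h>. The literal is an assumption
# valid for x86/ARM; newer Python versions export it in the fcntl module.
FICLONE = getattr(fcntl, 'FICLONE', 0x40049409)


def copy_reflink_or_full(source, target):
    """Copy source to target, sharing extents when the FS allows it.

    Returns True if a reflink clone was made, False if it fell back to
    a plain full copy.
    """
    try:
        with open(source, 'rb') as src, open(target, 'wb') as dst:
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
        return True
    except OSError:
        # Cloning failed (e.g. "Operation not supported" on ext4 or
        # "Invalid cross-device link"). Fall back to a regular copy,
        # which will raise on its own if there is a real I/O problem.
        pass
    shutil.copyfile(source, target)
    return False
```

On xfs or btrfs this would keep the overhead small when the temporary
file lives on the same filesystem as the dump; elsewhere it behaves like
the current shutil.copyfile() call.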