On 06. 02. 26, 12:54, Matthieu Baerts wrote:
> Our CI for the MPTCP subsystem is now regularly hitting various stalls before even starting the MPTCP test suite. These issues are visible on top of the latest net and net-next trees, which were synced with Linus' tree yesterday. All these issues have been seen on a "public CI" using GitHub-hosted runners with KVM support, where the tested kernel is launched in a nested (I suppose) VM. I can see the issue with or without debug.config. According to the logs, it might have started around v6.19-rc0, but I was unavailable for a few weeks and couldn't react more quickly, sorry for that. Unfortunately, I cannot reproduce this locally, and the CI doesn't currently have the ability to execute bisections.
Hmm, after the switch of the QEMU guest kernels to 6.19, our (openSUSE) build service is randomly stalling in smp_call_function_many_cond() too:
https://bugzilla.suse.com/show_bug.cgi?id=1258936

The attachment from there contains sysrq-t logs too:
https://bugzilla.suse.com/attachment.cgi?id=888612
> The stalls happen before starting the MPTCP test suite. The init program creates a VSOCK listening socket via socat [1], and different hangs are then visible: RCU stalls followed by a soft lockup [2], only a soft lockup [3], sometimes the soft lockup comes with a delay [4] [5], or there are no RCU stalls or soft lockups detected after one minute, but the VM is stalled [6]. In the last case, the VM is stopped after having launched GDB to get more details about what was being executed. It feels like the issue is not directly caused by the VSOCK listening socket, but the stalls always happen after having started the socat command [1] in the background.
It fails randomly while building random packages (go, libreoffice, bayle, ...). I don't think it is VSOCK related in those cases, but who knows what the builds do...
I cannot reproduce locally either.

I came across:
  614da1d3d4cd x86: make page fault handling disable interrupts properly
but I have no idea if it could have an impact on this at all.

thanks,
-- 
js
suse labs

