Hi,

On 28-05-2025 10:56, Helmut Grohne wrote:
I uploaded 0.4.2 limiting concurrency to 4 CPUs at most.
Thanks. Only later did I realize that while amd64 has 64 cores, other architectures such as ppc64el and s390x also have quite a few cores available (16 and 10, respectively), but we haven't seen the fallout there.
I scheduled 10 runs on amd64. https://ci.debian.net/user/helmutg/jobs Two of them failed in debefivm and not in debvm. It's still flaky, but maybe less so (or luck).
That's at the limit where I typically file these flaky bugs.
Given the output, I'm quite sure that qemu actually hangs and that increasing any timeout does not buy us anything. To me, this feels like qemu and/or linux being randomly buggy.
If you suspect the kernel, I'll be upgrading the host to trixie soon. I might pull that forward.
I note that I have never reproduced the specific failure mode outside ci.d.n.
Ack.
Any suggestions for how to move forward from here?
If you think it could help you debug, I (or terceiro) can give you access to a testbed on the ci.d.n infra (while I'm on-line).
Paul