Hi,

On 28-05-2025 10:56, Helmut Grohne wrote:
> I uploaded 0.4.2 limiting concurrency to 4 CPUs at most.


Thanks. Only later did I realize that while amd64 has 64 cores, ppc64el and s390x also have quite a few cores available (16 and 10, respectively), yet we haven't seen the fallout there.

> I scheduled 10 runs on amd64. https://ci.debian.net/user/helmutg/jobs
> Two of them failed in debefivm and not in debvm. It's still flaky, but
> maybe less so (or luck).


That's at the limit where I typically file these flaky bugs.

> Given the output, I'm quite sure that qemu actually hangs and
> that increasing any timeout does not buy us anything. To me, this feels
> like qemu and/or linux being randomly buggy.


If you suspect the kernel: I'll be upgrading the host to trixie soon. I might pull that forward.

> I note that I have never reproduced the specific failure mode outside
> ci.d.n.


Ack.

> Any suggestions for how to move forward from here?


If you think it could help you debug, I (or terceiro) can give you access to a testbed on the ci.d.n infra (while I'm online).

Paul

