Hi,

I've just run the src:apparmor autopkgtests on ci-worker05 a bunch of times, using the autopkgtest command:

 - from an unpacked source tree:
    - experimental (3.0.1-4): 5 times
    - unstable (2.13.6-8): 3 times
 - using the package that's in the archive:
    - experimental (3.0.1-4): 2 times
    - unstable (2.13.6-8): 2 times
    - Buster (2.13.2-10): 2 times

Every time, the test succeeded, then the autopkgtest command exited immediately, and the LXC container was promptly stopped and destroyed.

IOW, I was not able to reproduce the bug. So either I've been incredibly lucky, or the bug has already been fixed somehow, or the bug is only triggered when the autopkgtest is started by the debci worker service.

Any idea on how to proceed from here? I suppose someone would have to be logged into the relevant worker when the bug happens, in order to investigate further.

OTOH it's also tempting to wait until you upgrade the ci.d.n infra to Bullseye and come back to it at that point, especially if there's a simple short-term mitigation:

Paul Gevers wrote:
> Something in the test is very often preventing autopkgtest (the
> binary) from stopping and cleaning up the lxc container within the
> 600 seconds it gets to do that, which leads to a tmpfail for the
> apparmor autopkgtest and a still running lxc container on the
> worker. […] Your autopkgtest itself normally passes before causing
> the issue.

I understand the LXC container is stopped and restarted between each test, so what Paul wrote above suggests the container is successfully stopped between the 1st and 2nd test, same between the 2nd and 3rd test, but then it occasionally fails to stop after the last (3rd) test. Correct?

If this analysis is correct, then the culprit has to be in the 3rd test, i.e.:

    # Dummy test so that changes to linux-image-amd64 trigger our other autopkgtests
    # on ci.debian.net
    Test-Command: /bin/true
    Depends: linux-image-amd64 [amd64] | linux-image-generic [ amd64 ]
    Restrictions: superficial, skip-not-installable

… and this is the one I should mark as isolation-machine, so we can resume running the other 2 tests on ci.d.n, which I would very much like. Makes sense?

Cheers!