Your message dated Thu, 10 Nov 2022 13:52:56 +0100
with message-id <e06cc99e-98b9-a3c3-04b1-afdf7ee5c...@debian.org>
and subject line Re: Bug#1004107: meson: flaky autopkgtest on armhf: dictionary
changed size during iteration -> timeout
has caused the Debian Bug report #1004107,
regarding meson: flaky autopkgtest on armhf: dictionary changed size during
iteration -> timeout
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)
--
1004107: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1004107
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Source: meson
Version: 0.56.2-1
Severity: serious
X-Debbugs-CC: debian...@lists.debian.org
User: debian...@lists.debian.org
Usertags: flaky timeout
Dear maintainer(s),
I looked at the results of the autopkgtest of your package on armhf
because it was showing up as a regression for the upload of
python-defaults and setuptools. I noticed that the test regularly fails.
What's worse, it also seems to hang: the test is killed because it
hits an autopkgtest timeout.
Because the unstable-to-testing migration software now blocks on
regressions in testing, flaky tests, i.e. tests that flip between
passing and failing without changes to the list of installed packages,
are causing people unrelated to your package to spend time on these
tests. In this case, Release Team members had to investigate if curl was
OK to go into the next Stable point release.
Don't hesitate to reach out if you need help or more information
from our infrastructure. Please note that the host we run our armhf
tests on is very powerful. It has 160 cores and 255 GB RAM. This is
sometimes the root cause of tests that fail. It seems that before we
switched to this host, the test was more reliable.
Paul
https://ci.debian.net/packages/m/meson/testing/amd64/
E.g.
https://ci.debian.net/data/autopkgtest/testing/armhf/m/meson/18519155/log.gz
Ran 462 tests in 569.524s
OK (skipped=66)
Meson build system 0.61.0 Unit Tests
pytest-xdist not found, using unittest instead
Total time: 569.540 seconds
Meson build system 0.61.0 Project Tests
Using python 3.9.9 (main, Jan 12 2022, 16:10:51)
host machine compilers
c : [gcc] cc (gcc 11.2.0 "cc (Debian 11.2.0-13) 11.2.0")
cpp : [gcc] c++ (gcc 11.2.0 "c++ (Debian 11.2.0-13) 11.2.0")
cs : [mono] mcs (mono 6.8.0.105)
cuda : [not found]
cython : [not found]
d : [llvm] ldc2 (llvm 1.28.0 "LDC - the LLVM D compiler (1.28.0):")
fortran: [gcc] gfortran (gcc 11.2.0 "GNU Fortran (Debian 11.2.0-13) 11.2.0")
java : [unknown] javac (unknown 11.0.13)
objc : [gcc] cc (gcc 11.2.0)
objcpp : [gcc] c++ (gcc 11.2.0)
rust : [rustc] rustc -C linker=cc (rustc 1.56.0)
swift : [not found]
vala : [valac] valac (valac 0.54.6)
tools
ninja : /usr/bin/ninja (1.10.1)
cmake : /usr/bin/cmake (3.22.1)
hotdoc : not found
Checking that configuring works...
Checking that introspect works...
Checking that building works...
Checking that testing works...
Checking that installing works...
Running tests with 160 workers
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.9/concurrent/futures/process.py", line 317, in run
    result_item, is_broken, cause = self.wait_result_broken_or_wakeup()
  File "/usr/lib/python3.9/concurrent/futures/process.py", line 376, in wait_result_broken_or_wakeup
    worker_sentinels = [p.sentinel for p in self.processes.values()]
  File "/usr/lib/python3.9/concurrent/futures/process.py", line 376, in <listcomp>
    worker_sentinels = [p.sentinel for p in self.processes.values()]
RuntimeError: dictionary changed size during iteration
Running cmake tests.
autopkgtest [16:04:29]: ERROR: timed out on command "su -s /bin/bash
debci -c set -e; export USER=`id -nu`; . /etc/profile >/dev/null 2>&1 ||
true; . ~/.profile >/dev/null 2>&1 || true;
buildtree="/tmp/autopkgtest-lxc.f3gr65px/downtmp/build.IRe/src"; mkdir
-p -m 1777 --
"/tmp/autopkgtest-lxc.f3gr65px/downtmp/exhaustive-artifacts"; export
AUTOPKGTEST_ARTIFACTS="/tmp/autopkgtest-lxc.f3gr65px/downtmp/exhaustive-artifacts";
export ADT_ARTIFACTS="$AUTOPKGTEST_ARTIFACTS"; mkdir -p -m 755
"/tmp/autopkgtest-lxc.f3gr65px/downtmp/autopkgtest_tmp"; export
AUTOPKGTEST_TMP="/tmp/autopkgtest-lxc.f3gr65px/downtmp/autopkgtest_tmp";
export ADTTMP="$AUTOPKGTEST_TMP"; export DEBIAN_FRONTEND=noninteractive;
export LANG=C.UTF-8; export DEB_BUILD_OPTIONS=parallel=160; unset
LANGUAGE LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY
LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
LC_IDENTIFICATION LC_ALL;rm -f /tmp/autopkgtest_script_pid; set -C; echo
$$ > /tmp/autopkgtest_script_pid; set +C; trap "rm -f
/tmp/autopkgtest_script_pid" EXIT INT QUIT PIPE; cd "$buildtree"; chmod
+x
/tmp/autopkgtest-lxc.f3gr65px/downtmp/build.IRe/src/debian/tests/exhaustive;
touch /tmp/autopkgtest-lxc.f3gr65px/downtmp/exhaustive-stdout
/tmp/autopkgtest-lxc.f3gr65px/downtmp/exhaustive-stderr;
/tmp/autopkgtest-lxc.f3gr65px/downtmp/build.IRe/src/debian/tests/exhaustive
2> >(tee -a /tmp/autopkgtest-lxc.f3gr65px/downtmp/exhaustive-stderr >&2)
> >(tee -a /tmp/autopkgtest-lxc.f3gr65px/downtmp/exhaustive-stdout);"
(kind: test)
autopkgtest [16:04:30]: test exhaustive: -----------------------]
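
For readers unfamiliar with the RuntimeError in the traceback above, here is a minimal sketch of the error class. It is not code from meson or from concurrent.futures, and all names in it are hypothetical. In the log, the dictionary was presumably resized by another thread while it was being iterated. This single-threaded version only reproduces the same error message and shows the usual remedy of iterating over a snapshot.

```python
# Hypothetical illustration only; not taken from meson or CPython.

def collect_unsafe(processes):
    """Iterate over the dict while inserting into it (raises RuntimeError)."""
    names = []
    for pid, name in processes.items():
        names.append(name)
        # Adding a key changes the dict size during iteration.
        processes[pid + 1000] = "late-" + name
    return names


def collect_snapshot(processes):
    """Iterate over a list copy, so inserting into the dict is safe."""
    names = []
    for pid, name in list(processes.items()):
        names.append(name)
        processes[pid + 1000] = "late-" + name
    return names


def demo():
    """Return the error text from the unsafe variant and the safe result."""
    try:
        collect_unsafe({1: "a", 2: "b"})
    except RuntimeError as exc:
        error = str(exc)
    else:
        error = None
    return error, collect_snapshot({1: "a", 2: "b"})
```

The snapshot costs one extra list per pass. It does not by itself make concurrent access from several threads safe.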
--- End Message ---
--- Begin Message ---
Version: 0.62.2-1
Hi,
On Fri, 6 May 2022 23:25:38 +0300 Jussi Pakkanen <jpakk...@gmail.com> wrote:
> On Thu, 5 May 2022 at 22:39, Paul Gevers <elb...@debian.org> wrote:
>> It just occurred to me that it may be useful to try and reduce the
>> number of concurrent running tests to something you would expect on a
>> more normal computer (under conditions where the framework is better
>> tested). Our armel host has 160 cores, similar, our amd64 ci-worker13
>> host has 56.
>
> No harm in trying I guess:
> https://github.com/mesonbuild/meson/pull/10358

I think that worked. At least with version 0.62.2-1 and later I haven't
seen the timeout (I assume the change came in that version). On top of
that, we now run armhf in restricted VMs with only 16 cores, so if the
cause really was too much parallelism, it's also mitigated on our side.
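
As a rough illustration of limiting test parallelism on a many-core host, here is a minimal sketch. It is not the change made in the pull request above: the cap value, the function names and the use of a thread pool (chosen only to keep the example short; the log shows a process pool) are all assumptions.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Assumed cap for illustration; the value actually used upstream may differ.
MAX_WORKERS = 16


def pick_worker_count(detected, cap=MAX_WORKERS):
    """Return a worker count between 1 and cap."""
    if not detected or detected < 1:
        # os.cpu_count() may return None; fall back to a single worker.
        return 1
    return min(detected, cap)


def run_capped(jobs, detected=None):
    """Run zero-argument callables on a pool sized by pick_worker_count."""
    if detected is None:
        detected = os.cpu_count()
    workers = pick_worker_count(detected)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the submitted jobs.
        results = list(pool.map(lambda job: job(), jobs))
    return workers, results
```

With a detected count of 160, as on the host described earlier, this sketch would start 16 workers instead of 160.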
Let's close this bug.
Paul
--- End Message ---