Bug#897607: dbus: Please raise timeout for a few test cases (or disable) in riscv64

Simon McVittie Thu, 03 May 2018 10:13:17 -0700

Control: tags -1 + moreinfo

On Thu, 03 May 2018 at 16:30:19 +0200, Manuel A. Fernandez Montecelo wrote:
> This package fails to build for the riscv64 architecture, because these two
> test cases don't pass:
>
>   https://buildd.debian.org/status/fetch.php?pkg=dbus&arch=riscv64&ver=1.12.8-2&stamp=1525355576&raw=0
>
>   ERROR: test-monitor - Bail out! Test timed out (GLib main loop timeout callback reached)
>   ERROR: test-relay - Bail out! Test timed out (SIGALRM received)


As you conjectured, those arbitrary timeouts are there to stop the tests
from waiting forever if there's a deadlock or infinite-loop bug.

> So there are high chances that it's only that, slow "hardware".

Looking at the results of other tests, it's taking between 40 and 100
times as long as my laptop for some tests that succeeded; so yes, very
slow "hardware".

I'm a little reluctant to just add zeroes to the timeout until it succeeds;
building and testing dbus takes 10 minutes on x86 and 40 minutes on mips,
and I suspect you don't want it to take more than 16 hours (10 minutes *
100x worst-case factor = 1000 minutes) to run the build and the tests on
riscv64 :-)
At the same time, some test coverage is better than no test coverage, so
I don't really want to just disable them.

Please could you try the build on a representative
riscv64 emulator with the attached hack, and send
the resulting debian/build-main/test/test-monitor.log and
debian/build-main/test/test-relay.log to this bug? If it fails, increase
the numbers as desired (they're an arbitrary number of minutes per
test-case), but either way I'd like to see the logs so that I can tell
how much margin of error we get.

Alternatively, do you have a pre-prepared riscv64 qemu image that I could
try, or some other way to get the equivalent of a porterbox? How fast
is your slowest host machine for this qemu-system-riscv64 - hopefully
at least as fast as my laptop?

The relay test has a timeout of 60 seconds for each of its 3 test cases.
The one that is timing out sends a flood of 8192 messages in an attempt
to shake out a timing-related bug. I could bump that timeout up by a
factor of something like 10 (as is done in the attached hack), or skip
the test on slow architectures. I'd be reluctant to increase it to much
more than 10 minutes, because that's the length of time a developer will
have to wait for test results if they have introduced a deadlock.

The monitor test has a single timeout of 60 seconds shared by all 17 of
its test cases, none of which involve particularly many messages - I
suspect it's just taking a few seconds per test case, and because there
are 17 of them, it adds up. I could relax it to 60 seconds for each of
the 17 test cases without losing much (as is done in the attached hack),
or I could multiply up the timeout if absolutely necessary.
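
To illustrate why calling the reset from each test case's setup helps: a
minimal sketch, assuming a POSIX alarm()-based timer. sketch_timeout_reset
is a hypothetical stand-in, not the real test_timeout_reset from the dbus
test suite, whose implementation is not shown in this mail.

```c
#include <unistd.h>

/*
 * Hypothetical stand-in for a per-test-case timeout reset.
 *
 * alarm() replaces any alarm that is already pending, so calling this at
 * the start of every test case gives each case its own full budget
 * instead of making all cases share one countdown.
 *
 * Returns the number of seconds that were left on the previous alarm,
 * or 0 if none was pending.
 */
static unsigned int
sketch_timeout_reset (unsigned int minutes)
{
  return alarm (minutes * 60);
}
```

With one shared 60-second alarm, 17 test cases that each take 4 seconds
would need 68 seconds and time out; with a reset in each setup, every
case gets a fresh 60 seconds.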

At the moment the timeout is architecture-independent, but it would be
straightforward to multiply it by 2 or 5 or 10 on selected architectures.
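
A rough sketch of what that could look like, assuming the compiler's
predefined architecture macros (__riscv, __mips__) are used to pick the
factor. The helper names and the factors 10 and 5 are hypothetical
placeholders, not anything that exists in dbus today.

```c
/*
 * Hypothetical helper: choose a timeout multiplier at compile time.
 * The factors are illustrative only and would need tuning from real
 * test logs on each architecture.
 */
static unsigned int
sketch_arch_timeout_factor (void)
{
#if defined(__riscv)
  return 10;    /* emulated riscv64 buildds: very slow */
#elif defined(__mips__)
  return 5;     /* slower than x86, but real hardware */
#else
  return 1;     /* everything else keeps the current timeout */
#endif
}

/* Scale a base timeout (in minutes) by the per-architecture factor. */
static unsigned int
sketch_scaled_timeout_minutes (unsigned int base_minutes)
{
  return base_minutes * sketch_arch_timeout_factor ();
}
```

A test's setup could then pass sketch_scaled_timeout_minutes (1) instead
of a literal 1, so fast architectures would still fail quickly on a
deadlock.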

    smcv

diff --git a/test/monitor.c b/test/monitor.c
index df5a7180..b9a230e2 100644
--- a/test/monitor.c
+++ b/test/monitor.c
@@ -465,6 +465,8 @@ static void
 setup (Fixture *f,
     gconstpointer context)
 {
+  test_timeout_reset (1);
+
   f->config = context;
 
   f->ctx = test_main_context_get ();
diff --git a/test/relay.c b/test/relay.c
index 00e7966a..97dfec64 100644
--- a/test/relay.c
+++ b/test/relay.c
@@ -122,7 +122,7 @@ static void
 setup (Fixture *f,
     gconstpointer data G_GNUC_UNUSED)
 {
-  test_timeout_reset (1);
+  test_timeout_reset (10);
 
   f->ctx = test_main_context_get ();
   dbus_error_init (&f->e);
