Control: tags -1 + moreinfo

On Thu, 03 May 2018 at 16:30:19 +0200, Manuel A. Fernandez Montecelo wrote:
> This package fails to build for the riscv64 architecture, because these two
> test cases don't pass:
>
> https://buildd.debian.org/status/fetch.php?pkg=dbus&arch=riscv64&ver=1.12.8-2&stamp=1525355576&raw=0
>
> ERROR: test-monitor - Bail out! Test timed out (GLib main loop timeout
> callback reached)
> ERROR: test-relay - Bail out! Test timed out (SIGALRM received)

As you conjectured, those arbitrary timeouts are meant to be there to
stop the tests from waiting forever if there's a deadlock or infinite
loop bug.

> So there are high chances that it's only that, slow "hardware".

Looking at the results of other tests, it's taking between 40 and 100
times as long as my laptop for some tests that succeeded; so yes, very
slow "hardware".

I'm a little reluctant to just add zeroes to the timeout until it
succeeds; building and testing dbus takes 10 minutes on x86 and 40
minutes on mips, and I suspect you don't want it to take 16 hours
(10 minutes * 100x worst-case factor) to run the build and the tests
on riscv64 :-)

At the same time, test coverage is better than no test coverage, so I
don't really want to just disable them.

Please could you try the build on a representative riscv64 emulator
with the attached hack, and send the resulting
debian/build-main/test/test-monitor.log and
debian/build-main/test/test-relay.log to this bug? If it fails,
increase the numbers as desired (they're an arbitrary number of minutes
per test-case), but either way I'd like to see the logs so that I can
tell how much margin of error we get.

Alternatively, do you have a pre-prepared riscv64 qemu image that I
could try, or some other way to get the equivalent of a porterbox?

How fast is your slowest host machine for this qemu-system-riscv64 -
hopefully at least as fast as my laptop?

The relay test has a timeout of 60 seconds each for 3 test cases, in
which the one that is timing out sends a flood of 8192 messages in an
attempt to shake out a timing-related bug. I could bump that up by a
factor of something like 10 (as is done in the attached hack), or skip
it on slow architectures. I'd be reluctant to increase it to much more
than 10 minutes, because that's the length of time a developer will
have to wait for test results if they have introduced a deadlock.
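
To make the arithmetic behind those numbers explicit, here is a minimal sketch in C. The function names are hypothetical and nothing like this exists in dbus; it only restates the estimates above (a 10 minute build slowed down 100x, and 3 relay test cases at 60 seconds each with a 10x factor).

```c
#include <assert.h>

/* Hypothetical helper: wall-clock estimate, in minutes, for a job that
 * takes base_minutes on a reference machine and runs `slowdown` times
 * slower on the target. */
unsigned long
scaled_minutes (unsigned long base_minutes,
    unsigned long slowdown)
{
  assert (slowdown >= 1);
  return base_minutes * slowdown;
}

/* Hypothetical helper: longest time, in seconds, before every test case
 * has either finished or hit its timeout, when each case is allowed
 * per_case_seconds multiplied by factor. */
unsigned long
worst_case_wait_seconds (unsigned long n_cases,
    unsigned long per_case_seconds,
    unsigned long factor)
{
  assert (factor >= 1);
  return n_cases * per_case_seconds * factor;
}
```

With these, scaled_minutes (10, 100) gives 1000 minutes, a little under 17 hours, and worst_case_wait_seconds (3, 60, 10) gives 1800 seconds: each relay test case could wait 10 minutes, so a run in which all three deadlocked would take 30 minutes to report.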

The monitor test has a timeout of 60 seconds in total for 17 test
cases, none of which involve particularly many messages - I suspect
it's just taking a few seconds per test case, and because there are 17
of them, it adds up. I could relax it to 60 seconds for each of the 17
test cases without losing much (as is done in the attached hack), or I
could multiply up the timeout if absolutely necessary.

At the moment the timeout is architecture-independent, but it would be
straightforward to multiply it by 2 or 5 or 10 on selected
architectures.

    smcv

diff --git a/test/monitor.c b/test/monitor.c
index df5a7180..b9a230e2 100644
--- a/test/monitor.c
+++ b/test/monitor.c
@@ -465,6 +465,8 @@ static void
 setup (Fixture *f,
     gconstpointer context)
 {
+  test_timeout_reset (1);
+
   f->config = context;
 
   f->ctx = test_main_context_get ();
diff --git a/test/relay.c b/test/relay.c
index 00e7966a..97dfec64 100644
--- a/test/relay.c
+++ b/test/relay.c
@@ -122,7 +122,7 @@ static void
 setup (Fixture *f,
     gconstpointer data G_GNUC_UNUSED)
 {
-  test_timeout_reset (1);
+  test_timeout_reset (10);
 
   f->ctx = test_main_context_get ();
   dbus_error_init (&f->e);
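
For the idea of multiplying the timeout on selected architectures, a rough sketch of one possible shape follows. This is not how dbus implements test_timeout_reset; timeout_factor_for_arch and arm_watchdog are hypothetical names, and the factors 10 and 2 are placeholders rather than measured values. The only things relied on are the compiler-predefined macros __riscv and __mips__ and the POSIX alarm() call.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical per-architecture multiplier. The values are placeholders.
 * __riscv and __mips__ are predefined by GCC and clang when compiling
 * for those architectures. */
unsigned int
timeout_factor_for_arch (void)
{
#if defined(__riscv)
  return 10;
#elif defined(__mips__)
  return 2;
#else
  return 1;
#endif
}

/* Hypothetical watchdog: schedule SIGALRM after the scaled timeout,
 * replacing any alarm that was already pending. With no handler
 * installed, the default action for SIGALRM terminates the process.
 * Returns the number of seconds that were armed. */
unsigned int
arm_watchdog (unsigned int base_seconds)
{
  unsigned int seconds = base_seconds * timeout_factor_for_arch ();

  assert (seconds > 0);
  alarm (seconds);
  return seconds;
}
```

A test's setup function would call arm_watchdog (60) and its teardown would call alarm (0) to cancel the pending alarm. Keeping the multiplier in one function means the per-test base values stay architecture-independent, and only the table of factors needs adjusting when a slow port appears.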