David Malcolm wrote:
On Wed, 2020-07-22 at 17:05 -0500, Jacob Bachmeyer wrote:
[...]
In terms of "other comments":
FWIW within gcc, the jit.dg testsuite for libgccjit has a copy of
host_execute ("fixed_host_execute") to workaround issues I've run into:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/testsuite/jit.dg/jit.exp
It has drifted somewhat from the DejaGnu original; for example it
gained the ability to parse valgrind output and convert leaks into
pass/fail results.
The only differences seem to be support for running the test under
valgrind, an expect_after to raise an error if the Expect matching
buffer overflows, and some additional code to check the exit status of
the invoked program.
I have already added a different solution to ensuring that the child
process has time to finish: instead of immediately closing the spawn
handle when the {^Totals} line is reached, DejaGnu now (after the
PR42399 fix) reads to EOF. This is not really a complete solution
either, as it only works with local testing, and eventually the DejaGnu
unit test protocol will need an explicit end marker to allow for remote
testing that returns to a shell prompt. This will need considerably
more planning to account for cases where a unit test executable crashes
instead of returning, including possibly crashing the remote host entirely.
I am unsure if DejaGnu should even recognize the "Totals" line, as it
does not fit the overall pattern of {^\t[][[:upper:]]+:}. The internal
unit tests for unit testing do not generate a "Totals" line at all.
Similarly, the full_buffer handling has been included into the main
expect call in host_execute to fix that issue. Instead of aborting the
test, upstream DejaGnu will log an ERROR (causing the next test to be
recorded as UNRESOLVED) and attempt to resynchronize. I believe that
the unit testing protocol can support this, although we do not currently
have a test for this recovery sequence.
Generally, checking the exit status does not seem to be long-term
supportable, particularly with the future plans for transparent remote
testing, where an exit status may not be available on some host
platforms. Please do not rely on it.
Future development should make this easier, but running the test under
valgrind should be currently possible with a wrapper instead of
replacing host_execute.
Overall, 1.6.3 should enable you to replace fixed_host_execute with a
new wrap_host_execute that handles using valgrind with a call to
DejaGnu's host_execute procedure. (Also fixed as part of tests for
PR42399: host_execute no longer insists on running executables from the
current directory. This was needed to make a regression test (written
in Awk) for PR42399 run correctly.)
(I also ran into the issue that dejagnu.h's pass/fail C functions
aren't thread-safe, which I hack around in my testsuite, replacing them
in multi-threaded tests with ones guarded by a mutex).
Looking at dejagnu.h, I see a few problems; the use of a shared static
buffer is definitely one of them. Would changing those to make
dejagnu.h thread-safe be a sufficient concern to do for 1.6.3? If so,
please file a bug report at <bug-deja...@gnu.org>; I see a solution
using flockfile and vprintf that avoids building up a buffer at all and
would also fix the minor issue of truncating long test names. This will
be in 1.6.4 in any case, but if thread-safety in dejagnu.h is important
to you, please file a bug and we will work to land this as a bug fix for
1.6.3 instead of an enhancement for 1.6.4.
The internal total counts are a less severe problem, as the DejaGnu test
driver will perform its own count as it reads the results. The C code
already plays fast and loose with counting results: xfail/xpass bump
the same counters as fail/pass, respectively.
-- Jacob