Hello Chris,
On 07/09/2020 04:55, Chris Johns wrote:
> Hello,
>
> I would like to discuss BSP test results early in the release cycle in the
> hope of avoiding the last minute issues we encountered with RTEMS 5 and the
> "expected" failure state ticket.
>
> I would like to update this section ...
>
> https://docs.rtems.org/branches/master/user/testing/tests.html#expected-test-states
>
> to state that there is to be a ticket for each `expected-fail` test state. I
> believe this was the outcome of the discussion that took place. Please
> correct me if this is not correct.
>
> The purpose of the `expected-fail` state is to aid the accounting of the test
> results and let us know if there are any regressions. We need to account for
> tests that fail so we can track whether a recent commit results in a new
> failure, i.e. a regression. To do this we need to capture the state in a way
> that lets `rtems-test` indicate a regression.
At first I was a bit sceptical about the expected-fail state, but you
fully convinced me that it is a good approach. We need some automatic
way to report regressions and the expected-fail helps here.
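
Something along these lines could automate that accounting on top of the
summary rtems-test prints (a rough sketch only, not part of rtems-test
itself; it just scrapes the "Failed:" line from a report like the ones in
your example further down):

#!/usr/bin/env python3
# Rough sketch: exit non-zero if the rtems-test summary on stdin reports
# unaccounted-for failures.  Only the "Failed:" count matters here; the
# "Expected Fail:" count does not flag a regression.
import re
import sys

def has_regression(summary_text):
    match = re.search(r"^\s*Failed:\s*(\d+)\s*$", summary_text, re.MULTILINE)
    if match is None:
        return True  # no summary found, treat this as a problem too
    return int(match.group(1)) != 0

if __name__ == "__main__":
    sys.exit(1 if has_regression(sys.stdin.read()) else 0)
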
> I think the `indeterminate` state may need further explanation, as it will
> help in the cases where a simulator passes a test but the test fails on some
> hardware. I am currently seeing this with spcache01 on the PC BSP.
The spintrcritical* tests had some sporadic failures on simulators. So
sometimes they pass, sometimes they fail, sometimes they time out. I
tried to fix this with the new T_interrupt_test(), but I am not sure if
these tests can be made to pass reliably.
> With the level of continuous building and testing we are currently doing,
> being able to easily determine a regression will become important. Check out
> the example below.
>
> I would like to avoid us sitting with failures that do not have tickets and
> are not accounted for. I know there is a lump of work to account for the
> failures, and after that is done I think the effort needed to maintain the
> failure states will drop.
>
> As a result I have been pondering how I can encourage this work to be done.
> I am considering updating the tier-1 status to require that there be 0
> unaccounted-for failures, that is, that `rtems-test`'s Failure count is 0
> for a hardware test run.
We would like to set up some continuous testing of the upstream
GCC/GDB/Newlib. As it turned out, building the tools is not enough. We
also have to build the tests for selected BSPs to catch compile/link
errors. We also have to run the tests. For this we need a simple error
status from rtems-test, e.g. everything as usual (0) or there was a
regression (1).
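
The per-BSP step in such a job could then be as simple as the sketch below;
the BSP name and the path to the built test executables are placeholders,
and it assumes rtems-test exits with a non-zero status exactly when there is
such a regression:

#!/usr/bin/env python3
# Sketch of a per-BSP step in a continuous testing job.  Assumes rtems-test
# exits with a non-zero status when it records a regression (the simple
# error status mentioned above).  BSP name and executable path are
# placeholders.
import subprocess
import sys

result = subprocess.run(
    ["rtems-test", "--rtems-bsp=erc32-sis", "build/sparc/erc32/testsuites"]
)
if result.returncode != 0:
    print("regression detected for erc32", file=sys.stderr)
    sys.exit(1)
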
> Chris
>
> An example using Joel's recent test run (thanks Joel :)). The sparc/leon2
> results show no regressions:
>
> Summary
> =======
> Passed:        580
> Failed:          0
> User Input:      6
> Expected Fail:   1
> Indeterminate:   0
> Benchmark:       3
> Timeout:         0
> Invalid:         0
> Wrong Version:   0
> Wrong Build:     0
> Wrong Tools:     0
> ------------------
> Total:         590
>
> [ https://lists.rtems.org/pipermail/build/2020-September/018089.html ]
>
> while the sparc/erc32 has a single failure:
>
> Summary
> =======
> Passed:        579
> Failed:          1
> User Input:      6
> Expected Fail:   1
> Indeterminate:   0
> Benchmark:       3
> Timeout:         0
> Invalid:         0
> Wrong Version:   0
> Wrong Build:     0
> Wrong Tools:     0
> ------------------
> Total:         590
>
> Failures:
>  spintrcritical08.exe
>
> [ https://lists.rtems.org/pipermail/build/2020-September/018088.html ]
I am working on this one. It seems the 1000us clock tick is the issue; the
test passes with 10000us per tick.
--
Sebastian Huber, embedded brains GmbH
Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.
This message is not a business communication within the meaning of the EHUG.