Hello Chris,

On 07/09/2020 04:55, Chris Johns wrote:
> Hello,
>
> I would like to discuss BSP Test results early in the release cycle in the hope
> we avoid the last minute issues we encountered with RTEMS 5 and the "expected"
> failure state ticket.
>
> I would like to update this section ...
>
> https://docs.rtems.org/branches/master/user/testing/tests.html#expected-test-states
>
> to state there is to be a ticket for each `expected-fail` test state. I believe
> this was the outcome of the discussion that took place. Please correct me if
> this is not correct.
>
> The purpose of the `expected-fail` is to aid the accounting of the test results
> to let us know if there are any regressions. We need to account for tests that
> fail so we can track if a recent commit results in a new failure, i.e. a
> regression. To do this we need to capture the state in a way `rtems-test` can
> indicate a regression.

At first I was a bit sceptical about the expected-fail state, but you fully convinced me that it is a good approach. We need an automatic way to report regressions and expected-fail helps here.
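
For reference, and if I remember the mechanism correctly, such a state is recorded in the BSP's test configuration (*.tcfg) files, so one entry per expected failure along the lines of this sketch (the directive, test name and ticket reference here are only illustrative):

  # Fails on this hardware, see ticket #<number>
  expected-fail: spcache01

The state then ends up in the built test executable and rtems-test counts the test under "Expected Fail" instead of "Failed", which is exactly what lets it flag a regression.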


> I think the `indeterminate` state may need further explanation as it will help
> in the cases where a simulator passes a test but the test fails on some hardware.
> I am currently seeing this with spcache01 on the PC BSP.

The spintrcritical* tests had some sporadic failures on simulators: sometimes they pass, sometimes they fail, sometimes they time out. I tried to fix this with the new T_interrupt_test(), but I am not sure if these tests can be made to pass reliably everywhere.


> With the level of continuous building and testing we are currently doing, being
> able to easily determine a regression will become important. Check out the
> example below.
>
> I would like to avoid us sitting with failures that do not have tickets and are
> not accounted for. I know there is a lump of work to account for the failures,
> and after that is done I think the effort needed to maintain the failure states
> will drop.
>
> As a result I have been pondering how I can encourage this work to be done. I am
> considering updating the tier-1 status to require there be 0 unaccounted-for
> failures. That is, `rtems-test`'s Failed count is 0 for a hardware test run.

We would like to set up some continuous testing of the upstream GCC/GDB/Newlib. As it turned out, building the tools is not enough: we also have to build the tests for selected BSPs to catch compile/link errors, and we have to run the tests as well. For this we need a simple exit status from rtems-test, e.g. everything as usual (0) or there was a regression (1).
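
Once rtems-test reports that through its exit code, the CI side could stay as simple as this sketch (the BSP tester name and executable path are placeholders, and the non-zero-on-regression behaviour is the part we would be relying on):

  # Fail the CI job if rtems-test sees a regression against the recorded states.
  rtems-test --rtems-bsp=erc32-sis path/to/testsuites || exit 1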


> Chris
>
> An example using Joel's recent test run (thanks Joel :)). The sparc/leon2
> results show no regressions:
>
> Summary
> =======
>
> Passed:        580
> Failed:          0
> User Input:      6
> Expected Fail:   1
> Indeterminate:   0
> Benchmark:       3
> Timeout:         0
> Invalid:         0
> Wrong Version:   0
> Wrong Build:     0
> Wrong Tools:     0
> ------------------
> Total:         590
>
> [ https://lists.rtems.org/pipermail/build/2020-September/018089.html ]
>
> while the sparc/erc32 has a single failure:
>
> Summary
> =======
>
> Passed:        579
> Failed:          1
> User Input:      6
> Expected Fail:   1
> Indeterminate:   0
> Benchmark:       3
> Timeout:         0
> Invalid:         0
> Wrong Version:   0
> Wrong Build:     0
> Wrong Tools:     0
> ------------------
> Total:         590
>
> Failures:
>   spintrcritical08.exe
>
> [ https://lists.rtems.org/pipermail/build/2020-September/018088.html ]

I am working on this one. It seems that the 1000us per tick is the issue; the test passes with 10000us per tick.
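
For anyone following along, the tick length is set via the usual application configuration option in the test, so the comparison is roughly this (only a sketch of the relevant confdefs.h setting, not the actual test code):

  /* Tick length the test currently runs with; spintrcritical08 fails sporadically on erc32. */
  #define CONFIGURE_MICROSECONDS_PER_TICK 1000

  /* With a 10 ms tick the test passes. */
  /* #define CONFIGURE_MICROSECONDS_PER_TICK 10000 */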

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.