On 3/2/21 3:13 am, Gedare Bloom wrote:
> On Tue, Feb 2, 2021 at 7:40 AM Joel Sherrill <j...@rtems.org> wrote:
> > On Mon, Feb 1, 2021 at 6:50 PM Chris Johns <chr...@rtems.org> wrote:
> > > On 2/2/21 9:12 am, Joel Sherrill wrote:
> > > > On Mon, Feb 1, 2021 at 3:50 PM Chris Johns <chr...@rtems.org> wrote:
> > > > > On 2/2/21 3:42 am, Joel Sherrill wrote:
> > > > > > Hi
> > > > > >
> > > > > > On the aarch64 qemu testing, we are seeing some tests which seem
> > > > > > to pass most of the time but fail intermittently. It appears to be
> > > > > > based somewhat on host load but there may be other factors.
> > > > > >
> > > > > > There does not appear to be a good test results state for these.
> > > > > > Marking them expected pass or fail means they will get flagged
> > > > > > incorrectly sometimes.
> > > > >
> > > > > We have the test state 'indeterminate' ...
> > > > >
> > > > > https://docs.rtems.org/branches/master/user/testing/tests.html#expected-test-states
> > > > >
> > > > > It is for this type of test result.
> > > > >
> > > > > > I don't see not running them as a good option. Beyond adding a new
> > > > > > state to reflect this oddity, any suggestions?
> > > > >
> > > > > I prefer we use the already defined and documented state.
> > > >
> > > > +1
> > > >
> > > > Kinsey had already marked them as indeterminate and the guys were in
> > > > the process of documenting why. I interpreted the question of what to
> > > > do more broadly than it needed to be but the discussion was good.
> > >
> > > A discussion is needed and welcome. Handling these intermittent
> > > simulator failures is hard. I once looked into some gdb simulator cases
> > > when I first put rtems-test together and found myself quickly heading
> > > into a deep dark hole. I have not been back since.
> >
> > Agreed it is ugly.
> >
> > If the BSP has a simulator variant, then using the test configuration is
> > appropriate.
> >
> > But for the PC and leon3, we don't have separate sim builds of the BSP,
> > so if there are intermittent failures there, we would have to mark them
> > in the set shared with hardware test runs. That's bad.
>
> yeah, don't do that.
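
As background on the 'indeterminate' state quoted above: the expected state of
a test is set per BSP in the test configuration (.tcfg) files and ends up
built into the test executable. From memory the directives look roughly like
the following; the documentation linked above is authoritative and the file
and test names here are only examples:

   # <bsp>-testsuite.tcfg
   exclude: dl06                 # do not build or run this test
   expected-fail: psxfenv01      # built and run, known to fail
   indeterminate: sp04           # built and run, result cannot be relied on

An indeterminate result is reported but is not treated as a regression either
way.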
Agreed, don't do that. Simulation is nice and important but it is a
second-tier test and development framework; real hardware is tier 1. I am not
in favour of a cloned BSP that indicates the intended platform is a simulator
just to categorise these types of failures. As I stated before, this is a deep
hole. For a BSP like the PC, LEON3, ARM and RISCV, setting _any_ test state
based on a result obtained on a simulator requires extensive testing on
hardware to determine the result is specific to simulation and not a general
failure. If it is specific to simulation, you then need to ask whether the
simulator has an issue or whether the test itself has issues that only get
exposed on a loaded server running multiple simulations. For example, does the
simulated timer's clock track the simulated CPU cycles when the host is
loaded?

> > It's almost like we might need a conditional like "sp04: intermittent
> > sim=qemu" or something. Which means build it, but the tester ini could
> > know the simulator type and adjust its expectations. May have to account
> > for multiple simulators on the sim=XXX though. Just a thought.
>
> maybe pass a sim.tcfg file to tester that is different for the sim.cfg
> file than it is for the hw.cfg file?

Testing does not work this way. A test executable by design contains all the
information about the test and the outcome. Managing external files with state
information is something I decided was too difficult and fragile at best.

Chris
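
PS: To make the last point concrete, the tester is only given the BSP name and
the built executables; there are no side files of per-test state. Something
like the following, where the BSP name and path are only an example:

   rtems-test --rtems-bsp=xilinx_zynq_a9_qemu build/arm/xilinx_zynq_a9_qemu/testsuites

The expected state travels inside each executable, which is also why the same
binaries carry the same expectations whether they are run on qemu or on real
hardware.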