On Wed, 11 Oct 2017, Paulo Matos wrote:

> On 10/10/17 23:25, Joseph Myers wrote:
> > On Tue, 10 Oct 2017, Paulo Matos wrote:
> >
> >>     new test -> FAIL        ; New test starts as fail
> >
> > No, that's not a regression, but you might want to treat it as one (in the
> > sense that it's a regression at the higher level of "testsuite run should
> > have no unexpected failures", even if the test in question would have
> > failed all along if added earlier and so the underlying compiler bug, if
> > any, is not a regression).  It should have human attention to classify it
> > and either fix the test or XFAIL it (with issue filed in Bugzilla if a
> > bug), but it's not a regression.  (Exception: where a test failing results
> > in its name changing, e.g. through adding "(internal compiler error)".)
> >
>
> When someone adds a new test to the testsuite, isn't it supposed to not
> FAIL? If is does FAIL, shouldn't this be considered a regression?

Only a regression at the whole-testsuite level (in that "no FAILs" is the
desired state).  Not a regression in the sense of a regression bug in GCC
that might be relevant for release management (something user-visible that
worked in a previous GCC version but no longer works).

And if, for example, someone adds a dg-require-effective-target line to a
testcase, every later line number in that test goes up by 1.  Every
PASS / FAIL assertion whose name includes a line number is therefore
renamed, which results in spurious detection of a regression if you
consider new FAILs as regressions.  (Even at the whole-testsuite level, an
increased line number on an existing FAIL is not meaningfully a
regression.)

> For this reason all of this issues need to be taken care straight away

Well, I think it *does* make sense to do sufficient analysis on existing
FAILs to decide whether they are testsuite issues or compiler bugs: fix
them if they are testsuite issues, and XFAIL them with a reference to a bug
in Bugzilla if they are compiler bugs.  That is, try to get to the point
where no-FAILs is the normal expected testsuite state and it's Bugzilla,
not expected-FAILs-not-marked-as-XFAIL, that is used to track regressions
and other bugs.

> By not being unique, you mean between languages?

Yes (e.g. c-c++-common tests in both gcc and g++ tests might have the same
name in both .sum files, but should still be counted as different tests).

> I assume that two gcc.sum from different builds will always refer to the
> same test/configuration when referring to (for example):
> PASS: gcc.c-torture/compile/20000105-1.c  -O1  (test for excess errors)

The problem is when e.g. multiple diagnostics are being tested for on the
same line but the "test name" field in the dg-* directive is an empty
string for all of them.  One possible approach is to automatically (in your
regression checking scripts) append a serial number to the first, second,
third etc. cases of any given repeated test name in a .sum file.

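As a rough illustration of the serial-number idea, here is a minimal Python
sketch.  The function name uniquify and the " #N" suffix format are
hypothetical, chosen only for this example; the sketch also leaves the first
occurrence of a name unchanged and numbers only the second and later ones,
which is a choice made here so that a name stays stable when a duplicate
appears later, not something required by the approach.

```python
import re
from collections import Counter

# Result lines in a DejaGnu .sum file have the form "STATUS: test name".
RESULT_RE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|KPASS|KFAIL|UNRESOLVED|UNSUPPORTED|UNTESTED): (.*)$"
)


def uniquify(sum_lines):
    """Return (status, name) pairs with repeated names made unique.

    The second, third, ... occurrence of a given test name within one
    .sum file gets " #2", " #3", ... appended (hypothetical format).
    Lines that are not test results are skipped.
    """
    seen = Counter()
    results = []
    for line in sum_lines:
        match = RESULT_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # header, summary, or other non-result line
        status, name = match.groups()
        seen[name] += 1
        if seen[name] > 1:
            name = f"{name} #{seen[name]}"
        results.append((status, name))
    return results
```

Two .sum files processed this way can then be compared by name.  Note that
the numbering depends on the order of results within the file, so it assumes
that order is the same from one run to the next.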
Or you could count such duplicates as being errors that automatically
result in red test results, and get fixes for them into GCC as soon as
possible.

-- 
Joseph S. Myers
jos...@codesourcery.com