On Tue, 2025-07-01 at 23:04 +0100, Joern Wolfgang Rennecke wrote:
> Quite often I see a test quickly written to test some new feature (bug fix, extension or optimization) that has a couple of functions to cover various aspects of the feature, checked all together with a single scan-tree-dump-times, scan-rtl-dump-times etc. check, using the expected value for the target of the test writer. Or worse, it's all packed into one giant function, with unpredictable interactions between the different pieces of code. I think we have fewer of those recently, but please don't interpret this post as a suggestion to fall back to this practice.
>
> Quite often it turns out that the feature applies only to some of the functions / sites on some targets. The first reaction is often to create multiple copies of the scan-*-dump-times stanza, with mutually exclusive conditions for each copy. That might look harmless when there are only two cases, but as more are added, it quickly turns into an unmaintainable mess of lots of dejagnu directives with complicated conditions.
>
> This can get even worse if on some targets the compiler can emit the pattern multiple times for the same piece of source, like for vectorization that is tried with different vectorization factors.
>
> I think we should discuss what is best practice to address these problems efficiently, and preferably to write new tests that avoid them in the first place.
>
> When each function has a single site per feature where success is given if the pattern appears at least once, a straightforward solution that has already been used a number of times is to split the test into multiple smaller tests. The main disadvantages of this approach are that a large set of small files can clutter the directory where they appear, making it less maintainable, and that the compiler is invoked more often, generally with the same set of include files read each time, thus making the test runs slower.
>
> Another approach would be to use source line numbers, where present and distinctive, to add to the scan pattern to make it specific to the site under concern. That should, for instance, work for vectorization scan-tree-dump-times tests. The disadvantage of that approach is that the tests become more brittle, as the line numbers would have to be adjusted whenever the line numbers of the source site change, like when new include files, dejagnu directives at the file start, or typedefs are needed.
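To make the problem concrete before brainstorming alternatives, here's a rough sketch of the kind of test I think you're describing (the loop bodies, the expected counts and the effective-target selectors are made up for illustration, not taken from a real test):

  /* { dg-do compile } */
  /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */

  void
  f1 (int *a, int *b, int n)
  {
    for (int i = 0; i < n; i++)
      a[i] += b[i];
  }

  void
  f2 (float *a, float *b, int n)
  {
    for (int i = 0; i < n; i++)
      a[i] += b[i];
  }

  /* Each target that behaves differently tends to grow another copy of
     this stanza, with ever more elaborate selectors.  */
  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target { vect_int && vect_float } } } } */
  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_int && { ! vect_float } } } } } */

The line-number variant you mention would instead anchor each regex on the "file.c:NN:" prefix that the details dump prints in front of its messages, which is exactly what makes it brittle whenever the file is edited.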
Brainstorming some ideas on other possible approaches to making our tests less brittle; for context, I did some investigation back in 2018 into implementing "optimization remarks" like clang does: diagnostics about optimization decisions, so that you could have a dg directive like this on a particular line:

  foo (); /* { dg-remark "inlined call to 'foo' into 'bar'" } */

which eventually became this series of patches:

  [PATCH 00/10] RFC: Prototype of compiler-assisted performance analysis
  https://gcc.gnu.org/legacy-ml/gcc-patches/2018-05/msg01675.html

  [PATCH] v3 of optinfo, remarks and optimization records
  https://gcc.gnu.org/legacy-ml/gcc-patches/2018-06/msg01267.html

  [PATCH 0/2] v4: optinfo framework and remarks
  https://gcc.gnu.org/legacy-ml/gcc-patches/2018-07/msg00066.html

  [PATCH 0/5] [RFC v2] Higher-level reporting of vectorization problems
  https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00446.html

where the "remark" idea eventually got dropped in favor of optimization records (compressed json), which landed in GCC 9 as -fsave-optimization-record.

Ideas for further approaches:

(a) we could revisit adding optimization remarks: perhaps the dump subsystem could be extended so that it also reports diagnostics, and we could have DejaGnu directives that check for a remark relating to a particular source line

(b) have a script that reads the compressed json and turns it into something that's queryable from DejaGnu tests. This might be more flexible in that it can potentially distinguish between different copies of code (e.g. due to different inlining sites), but might be less easy to work with in terms of testsuite management.

Another idea: perhaps a new dump format for RTL that resembles diagnostics, with line information, and then use per-line dg directives on that, so that e.g. you can test that at a particular line we do or don't have some particular construct after a given RTL pass (e.g. that the asm for a particular line does/doesn't match a regex).

Hope this is constructive
Dave

> Maybe we could get the best of both worlds if we add a new dump option? Say, if we make that option add the (for polymorphic languages like C++: mangled) name of the current function to each dumped line that is interesting to scan for. Or just every line, if that's simpler.
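To make that quoted idea concrete, a test using such a dump option might look roughly like this; the option itself, the dump-line format and the directives below are all hypothetical, nothing like this exists today. Each interesting dump line would carry the (for C++, mangled) name of the containing function, e.g.

  f2: note: vectorized 1 loops

and a multi-function test could then stay in a single file but scan per function:

  /* Hypothetical directives: they assume a dump format that prefixes
     each line with the name of the current function.  */
  /* { dg-final { scan-tree-dump-times "f1: .*vectorized 1 loops" 1 "vect" } } */
  /* { dg-final { scan-tree-dump-times "f2: .*vectorized 1 loops" 1 "vect" { target vect_float } } } */

which would avoid both the splitting into many small files and the line-number brittleness.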