[Bug tree-optimization/120751] New: [16 Regression] 10-15% slowdown of 454.calculix on Zen4 and Zen5 since r16-1001-g0291f53f8d2343

pheeck at gcc dot gnu.org via Gcc-bugs Sat, 21 Jun 2025 14:40:37 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120751


            Bug ID: 120751
           Summary: [16 Regression] 10-15% slowdown of 454.calculix on
                    Zen4 and Zen5 since r16-1001-g0291f53f8d2343
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pheeck at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-pc-linux-gnu

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=1101.170.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=1240.170.0

there was a 10-15% execution-time slowdown of the 454.calculix SPEC 2006
benchmark when compiled with -O2 -march=x86-64-v3 (or -march=native) -flto
and run on a Zen4/Zen5 machine.
I bisected it to r16-1001-g0291f53f8d2343.

0291f53f8d2343ca0d39589ebffc31d9c328d6ab is the first bad commit
commit 0291f53f8d2343ca0d39589ebffc31d9c328d6ab
Author: Richard Biener <rguent...@suse.de>
Date:   Fri May 30 08:54:10 2025 +0200

    tree-optimization/120457 - avoid lowering of some single-element interleave

    The following makes sure we are not lowering single-element interleaving
    schemes in a way that defeats load vectorizing later but allows the
    VMAT_ELEMENTWISE fallback to be used.

            PR tree-optimization/120457
            * tree-vect-slp.cc (vect_lower_load_permutations): Implement
            the same heuristics as load vectorization for single-element
            interleaving that spans multiple vectors.

 gcc/tree-vect-slp.cc | 9 +++++++++
 1 file changed, 9 insertions(+)


This is a regression against GCC 15. See the comparison
here:

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=1104.170.0&plot.1=1144.170.0&plot.2=1101.170.0
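
For context, the "single-element interleaving" named in the commit message
is, roughly, a strided load where only one element of each group is used.
The loop below is a hypothetical illustration of that access pattern only.
It is not a reduced testcase from 454.calculix, the names gather_strided and
STRIDE are made up, and it has not been checked whether this loop reproduces
the slowdown or which vectorization strategy GCC picks for it.

```c
#include <stddef.h>

/* Hypothetical stride, chosen so that one load group is wider than one
   vector: 8 doubles are 64 bytes, i.e. two 32-byte AVX2 vectors.  */
enum { STRIDE = 8 };

/* Reads a[0], a[STRIDE], a[2*STRIDE], ... and skips the rest of each group.
   Each group of STRIDE elements contributes a single element, which is the
   "single-element interleaving" shape.  The caller must provide at least
   (n - 1) * STRIDE + 1 elements in `a` and n elements in `out`.  */
void gather_strided(double *restrict out, const double *restrict a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i * STRIDE];
}
```

A loop like this can be handled with wide vector loads plus permutes, or by
loading the used elements one at a time (the VMAT_ELEMENTWISE fallback the
commit message mentions); the commit changes which of these is chosen in
some cases.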


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
