https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119044

            Bug ID: 119044
           Summary: 5-16% slowdown of 436.cactusADM since
                    r15-7661-g8293b9e40f12e9
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pheeck at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=292.100.0

there was a 10% execution-time slowdown of the 436.cactusADM SPEC 2006
benchmark when run with -O2 -march=generic -flto on an AMD Zen2 machine.  I
bisected it to r15-7661-g8293b9e40f12e9:

ee30e2586a3142e63daaf301a561984f1d22d38d is the first bad commit
commit ee30e2586a3142e63daaf301a561984f1d22d38d
Author: Richard Biener <rguent...@suse.de>
Date:   Fri Feb 21 09:58:04 2025 +0100

    tree-optimization/118954 - avoid UB on ref created by predcom

    When predicitive commoning moves an invariant ref it makes sure to
    not build a MEM_REF with a base that is negatively offsetted from
    an object.  But in trying to preserve some transforms it does not
    consider association of a constant offset with the address computation
    in DR_BASE_ADDRESS leading to exactly this problem again.  This is
    arguably a problem in data-ref analysis producing such an out-of-bound
    DR_BASE_ADDRESS, but this looks quite involved to fix, so the
    following avoids the association in one more case.  This fixes the
    testcase while preserving the desired transform in
    gcc.dg/tree-ssa/predcom-1.c.

            PR tree-optimization/118954
            * tree-predcom.cc (ref_at_iteration): Make sure to not
            associate the constant offset with DR_BASE_ADDRESS when
            that is an offsetted pointer.

            * gcc.dg/torture/pr118954.c: New testcase.

 gcc/testsuite/gcc.dg/torture/pr118954.c | 22 ++++++++++++++++++++++
 gcc/tree-predcom.cc                     |  3 ++-
 2 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr118954.c
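
For readers unfamiliar with the pass touched by this commit: predictive
commoning reuses values that were loaded or stored in earlier loop iterations
instead of reloading them from memory.  The sketch below is only an
illustration of that idea, written by hand.  It is not the pr118954.c
testcase and not code from cactusADM, and the names fib_plain, fib_commoned
and variants_agree are made up for this example.

```c
#include <stddef.h>
#include <string.h>

#define N 16

/* Straightforward loop: every iteration loads a[i] and a[i + 1]
   from memory before storing a[i + 2]. */
static void fib_plain(int *a, size_t n)
{
    for (size_t i = 0; i + 2 < n; i++)
        a[i + 2] = a[i] + a[i + 1];
}

/* Hand-written equivalent of what predictive commoning aims for:
   values loaded or computed in one iteration are carried in scalars
   to the next, so the loop body itself needs no loads. */
static void fib_commoned(int *a, size_t n)
{
    if (n < 3)
        return;
    int prev2 = a[0];   /* initial loads, done once before the loop */
    int prev1 = a[1];
    for (size_t i = 0; i + 2 < n; i++) {
        int cur = prev2 + prev1;
        a[i + 2] = cur;
        prev2 = prev1;
        prev1 = cur;
    }
}

/* Returns 1 when both variants produce identical arrays. */
static int variants_agree(void)
{
    int x[N] = {1, 1};
    int y[N] = {1, 1};
    fib_plain(x, N);
    fib_commoned(y, N);
    return memcmp(x, y, sizeof x) == 0;
}
```

The initial loads before the loop are where the pass has to build a reference
to the array at a specific iteration, which is the kind of address
computation the commit message above is concerned with.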

There were also these cactusADM slowdowns in the same timeframe (so probably
caused by the same commit):

16% Zen2 -O2 -march=native -flto
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=290.100.0

16% Zen3 -O2 -march=native
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=464.100.0

These aren't regressions against older GCC versions.

Btw, there were also some speedups:

21% Zen2 -Ofast -march=native
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=301.100.0

13% Zen2 -O2 -march=native
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=291.100.0

From what I've seen it looks like the speedups balance out the slowdowns, maybe
even dominate them.  So maybe this isn't an issue?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
