https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120957

            Bug ID: 120957
           Summary: [16 Regression] 6-9% slowdown of 503.bwaves_r on
                    Zen{2,3} since r16-1647-gc06979ff957485
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pheeck at gcc dot gnu.org
                CC: liuhongt at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-pc-linux-gnu

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=295.427.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=471.427.0

there was a 6% execution-time slowdown of the 503.bwaves_r SPEC 2017
benchmark when run with -Ofast -march=native on an AMD Zen2 machine, and
a 9% slowdown on an AMD Zen3 machine.
I bisected this to r16-1647-gc06979ff957485 (2025-06-24).

c06979ff95748559da0c2d3aa4eda9d5999eaaf6 is the first bad commit
commit c06979ff95748559da0c2d3aa4eda9d5999eaaf6
Author: hongtao.liu <hongtao....@intel.com>
Date:   Wed Mar 5 12:25:32 2025 +0100

    Don't duplicate setup code cost when do group-candidate cost calucalution.

    -  /* Uses in a group can share setup code, so only add setup cost once.  */
    -  cost -= cost.scratch;

    It looks like the original code took into account avoiding double
    counting, but unfortunately cost is reset inside the follow loop which
    invalidates the upper code, and makes same setup code cost duplicated in
    each use of the group.

    The patch fix the issue. It can also improve 548.exchange_r by 6% with
    -march=x86-64-v3 -O2 due to better ivopt on EMR.

    No big performance impact for SPEC2017 on graviton4/SPR with -mcpu=native
    -Ofast -fomit-framepointer -flto=auto.

    gcc/ChangeLog:

            PR target/115842
            * tree-ssa-loop-ivopts.cc (determine_group_iv_cost_address):
            Don't recalculate inv_expr when group-candidate cost
            calucalution.

 gcc/tree-ssa-loop-ivopts.cc | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
bisect found first bad commit
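
To make the double counting described in the commit message concrete, here
is a toy model of it. This is only a sketch: the Cost struct and both
functions are hypothetical names invented for illustration and are not the
code in tree-ssa-loop-ivopts.cc. It assumes every use in a group carries
the same shareable setup cost and compares charging it once with charging
it per use.

```cpp
#include <vector>

// Hypothetical, simplified cost record; not GCC's comp_cost.
struct Cost {
  int total;    // full cost of one use, setup included
  int scratch;  // setup part that uses in the same group can share
};

// Setup duplicated: every use in the group contributes its full cost,
// so the shared setup is charged once per use.
int group_cost_duplicated(const std::vector<Cost> &uses) {
  int sum = 0;
  for (const Cost &c : uses)
    sum += c.total;
  return sum;
}

// Setup shared: the first use pays the setup, later uses contribute
// only the remainder of their cost.
int group_cost_shared(const std::vector<Cost> &uses) {
  int sum = 0;
  bool first = true;
  for (const Cost &c : uses) {
    sum += first ? c.total : c.total - c.scratch;
    first = false;
  }
  return sum;
}
```

In this model, three uses with total 5 and scratch 2 cost 15 when the setup
is duplicated and 11 when it is shared, so the change makes candidates for
multi-use groups look cheaper to ivopts than before. That shifts which
induction variables get selected, which may be what changed in the bwaves
hot loops.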


This is a regression against GCC 15. See the comparison (Zen2)
here:

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=1070.427.0&plot.1=1219.427.0&plot.2=295.427.0&;


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
