https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121155

            Bug ID: 121155
           Summary: [16 Regression] 4-6% slowdown of 444.namd since
                    r16-2193-g363b29a9cfbb47
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pheeck at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-pc-linux-gnu

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=476.120.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=798.120.0

there was a 4-6% execution-time slowdown of the 444.namd SPEC 2006
benchmark when compiled with -Ofast -march=native (and possibly also with
-flto and/or -fprofile-use).  I've seen this on AMD Zen{2,3,4} and on Intel Ice Lake
(3rd generation Xeon).
I bisected it to r16-2193-g363b29a9cfbb47 (committed July 11th).

363b29a9cfbb470d6987fb395035c56bae30c64b is the first bad commit
commit 363b29a9cfbb470d6987fb395035c56bae30c64b
Author: Richard Biener <rguent...@suse.de>
Date:   Thu Jul 10 13:30:30 2025 +0200

    properly compute fp/mode for scalar ops for vectorizer costing

    The x86 add_stmt_hook relies on the passed vectype to determine
    the mode and whether it is FP for a scalar operation.  This is
    unreliable now for stmts involving patterns and in the future when
    there is no vector type passed for scalar operations.

    To be least disruptive I've kept using the vector type if it is passed.

            * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Use
            the LHS of a scalar stmt to determine mode and whether it is FP.

 gcc/config/i386/i386.cc | 8 ++++++++
 1 file changed, 8 insertions(+)
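
For readers unfamiliar with the hook, the selection order the commit message
describes can be sketched roughly as below.  This is not the actual patch and
uses no GCC internals: Mode, ScalarType, VecType, ModeInfo and
scalar_cost_mode are hypothetical stand-ins invented for illustration.  The
only behaviour taken from the commit message is that the vector type is still
used when it is passed, and that the LHS of the scalar stmt is used otherwise.

```cpp
#include <optional>

// Hypothetical stand-ins for illustration only; these are not GCC types.
enum class Mode { SI, DI, SF, DF };

struct ScalarType { Mode mode; bool is_fp; };
struct VecType    { ScalarType elem; };   // a vector type, described by its element type
struct ModeInfo   { Mode mode; bool fp; };

// Choose the mode and FP-ness used to cost a scalar operation.
// 1. If a vector type was passed, keep using it (the "least disruptive" path).
// 2. Otherwise fall back to the type of the statement's LHS.
// 3. If neither is available, report that nothing could be determined.
inline std::optional<ModeInfo>
scalar_cost_mode(const std::optional<VecType>& vectype,
                 const std::optional<ScalarType>& lhs_type)
{
  if (vectype)
    return ModeInfo{vectype->elem.mode, vectype->elem.is_fp};
  if (lhs_type)
    return ModeInfo{lhs_type->mode, lhs_type->is_fp};
  return std::nullopt;
}
```

Under this sketch, a scalar stmt that previously reached the hook without a
vector type would now be costed using the mode and FP-ness of its LHS, which
is the kind of change that could shift vectorizer cost decisions.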


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
