https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121155

Bug ID: 121155
Summary: [16 Regression] 4-6% slowdown of 444.namd since r16-2193-g363b29a9cfbb47
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: pheeck at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org
Blocks: 26163
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=476.120.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=798.120.0

there was a 4-6% execution time slowdown of the 444.namd SPEC 2006 benchmark
when run with -Ofast -march=native (plus potentially with -flto and/or
-fprofile-use). I've seen this on AMD Zen{2,3,4} and on Intel Ice Lake (3rd
generation Xeon).

I bisected it to r16-2193-g363b29a9cfbb47 (committed July 11th).

363b29a9cfbb470d6987fb395035c56bae30c64b is the first bad commit
commit 363b29a9cfbb470d6987fb395035c56bae30c64b
Author: Richard Biener <rguent...@suse.de>
Date:   Thu Jul 10 13:30:30 2025 +0200

    properly compute fp/mode for scalar ops for vectorizer costing

    The x86 add_stmt_hook relies on the passed vectype to determine
    the mode and whether it is FP for a scalar operation.  This is
    unreliable now for stmts involving patterns and in the future when
    there is no vector type passed for scalar operations.

    To be least disruptive I've kept using the vector type if it is passed.

            * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Use
            the LHS of a scalar stmt to determine mode and whether it is FP.

 gcc/config/i386/i386.cc | 8 ++++++++
 1 file changed, 8 insertions(+)

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)