https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119044
            Bug ID: 119044
           Summary: 5-16% slowdown of 436.cactusADM since r15-7661-g8293b9e40f12e9
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pheeck at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=292.100.0

there was a 10% exec time slowdown of the 436.cactusADM SPEC 2006 benchmark
when run with -O2 -march=generic -flto on an AMD Zen2 machine.

I bisected it to r15-7661-g8293b9e40f12e9

ee30e2586a3142e63daaf301a561984f1d22d38d is the first bad commit
commit ee30e2586a3142e63daaf301a561984f1d22d38d
Author: Richard Biener <rguent...@suse.de>
Date:   Fri Feb 21 09:58:04 2025 +0100

    tree-optimization/118954 - avoid UB on ref created by predcom

    When predicitive commoning moves an invariant ref it makes sure to
    not build a MEM_REF with a base that is negatively offsetted from
    an object.  But in trying to preserve some transforms it does not
    consider association of a constant offset with the address computation
    in DR_BASE_ADDRESS leading to exactly this problem again.  This is
    arguably a problem in data-ref analysis producing such an out-of-bound
    DR_BASE_ADDRESS, but this looks quite involved to fix, so the
    following avoids the association in one more case.  This fixes the
    testcase while preserving the desired transform in
    gcc.dg/tree-ssa/predcom-1.c.

            PR tree-optimization/118954
            * tree-predcom.cc (ref_at_iteration): Make sure to not associate the
            constant offset with DR_BASE_ADDRESS when that is an offsetted pointer.

            * gcc.dg/torture/pr118954.c: New testcase.
 gcc/testsuite/gcc.dg/torture/pr118954.c | 22 ++++++++++++++++++++++
 gcc/tree-predcom.cc                     |  3 ++-
 2 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr118954.c

There were also these cactusADM slowdowns in the same timeframe (so probably
caused by the same commit):

16% Zen2 -O2 -march=native -flto
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=290.100.0
16% Zen3 -O2 -march=native
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=464.100.0

These aren't regressions against older GCC versions.

Btw, there were also some speedups

21% Zen2 -Ofast -march=native
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=301.100.0
13% Zen2 -O2 -march=native
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=291.100.0

From what I've seen it looks like the speedups balance out the slowdowns, maybe
even dominate them.  So maybe this isn't an issue?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
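
For context on the pass touched by the bisected commit: a minimal hand-written
C sketch of the kind of transform predictive commoning aims for, where values
loaded in one iteration are carried in scalars into the next instead of being
reloaded.  This is only an illustration under my own assumptions.  The function
names are made up, it is not the pr118954.c testcase, and it is not GCC's
actual output for cactusADM.

```c
#include <stddef.h>

/* Plain form: every iteration reloads a[i] and a[i + 1] from memory,
   even though both were produced or loaded in earlier iterations.
   (Hypothetical example function, not taken from the GCC testsuite.)  */
void fib_like_plain(int *a, size_t n)
{
    for (size_t i = 0; i + 2 < n; i++)
        a[i + 2] = a[i] + a[i + 1];
}

/* Hand-applied version of the idea behind predictive commoning: the two
   most recent values live in scalars, so the loop body only stores.
   (Hypothetical example function; the real pass works on GIMPLE.)  */
void fib_like_commoned(int *a, size_t n)
{
    if (n < 3)
        return;                 /* the loop would not execute at all */

    int x0 = a[0];              /* value of a[i]     for the first iteration */
    int x1 = a[1];              /* value of a[i + 1] for the first iteration */

    for (size_t i = 0; i + 2 < n; i++) {
        int x2 = x0 + x1;
        a[i + 2] = x2;
        x0 = x1;                /* rotate the carried values */
        x1 = x2;
    }
}
```

Both functions should leave the array in the same state.  The commit message
above concerns how the pass builds the memory reference for a given iteration
(ref_at_iteration), which this sketch does not model.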