https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119919

            Bug ID: 119919
           Summary: 7% exchange2 regression between
                    g:6390fc86995fbd5239497cb9e1797a3af51d3936 and
                    g:f72a2d221539cede358f2487b94bc370c6fc44b5
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

This reproduces on both Zen and Intel:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=298.407.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=800.407.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=470.407.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=957.407.0

I think this is likely extra vectorization enabled by fixing costs:

commit 0650ea627399a0ef23db434d4fce6b52b9faf557
Author: Jan Hubicka <hubi...@ucw.cz>
Date:   Tue Apr 22 23:47:14 2025 +0200

    Fix vectorizer costs of COND_EXPR, MIN_EXPR, MAX_EXPR, ABS_EXPR, ABSU_EXPR

    This patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR,
    MAX_EXPR, ABS_EXPR and ABSU_EXPR.  We previously costed ABS_EXPR and
    ABSU_EXPR, but that was only correct for the FP variant (where it
    corresponds to andss clearing the sign bit).  Integer abs/absu is open
    coded as a conditional move for SSE2, and SSE3 introduced an instruction.

    MIN_EXPR/MAX_EXPR compiles to minss/maxss for FP, and according to Agner
    Fog's tables they cost the same as sse_op on all targets.  The integer
    variants translate to a single instruction since SSE3.

    COND_EXPR translates to an open-coded conditional move for SSE2; SSE4.1
    simplified the sequence and AVX512 introduced mask registers.

    gcc/ChangeLog:

            * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Add
            special cases for COND_EXPR; make MIN_EXPR, MAX_EXPR, ABS_EXPR and
            ABSU_EXPR more realistic.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr89618-2.c: XFAIL.
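
For anyone trying to see how the cost change shifts vectorization decisions, a minimal sketch of the loop shapes the commit message talks about might look like the following. This is not taken from exchange2 (which is Fortran) nor from gcc.target/i386/pr89618-2.c; the function names abs_loop, clamp_loop and select_loop are made up for illustration, and whether GCC actually folds each ternary into ABS_EXPR, MIN_EXPR/MAX_EXPR or COND_EXPR depends on the optimization level and version.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: integer absolute value, a candidate for ABS_EXPR.
   Assumes no element equals INT32_MIN (negating it would overflow). */
void abs_loop(int32_t *restrict dst, const int32_t *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] < 0 ? -src[i] : src[i];
}

/* Hypothetical example: clamp to [lo, hi], a candidate for
   MAX_EXPR followed by MIN_EXPR on integers. */
void clamp_loop(int32_t *restrict dst, const int32_t *restrict src,
                int32_t lo, int32_t hi, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t v = src[i];
        v = v < lo ? lo : v;   /* max (v, lo) */
        v = v > hi ? hi : v;   /* min (v, hi) */
        dst[i] = v;
    }
}

/* Hypothetical example: per-element select, a candidate for COND_EXPR
   (a blend on SSE4.1, a masked move on AVX512, per the commit message). */
void select_loop(int32_t *restrict dst, const int32_t *restrict a,
                 const int32_t *restrict b, const int32_t *restrict mask,
                 size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = mask[i] > 0 ? a[i] : b[i];
}
```

Compiling this with something like "gcc -O3 -march=native -fopt-info-vec -c" on revisions before and after the commit should show which of the loops are reported as vectorized, which may help narrow down whether the extra vectorization is what costs exchange2 its 7%.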