https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119919

            Bug ID: 119919
           Summary: 7% exchange2 regression between
                    g:6390fc86995fbd5239497cb9e1797a3af51d3936 and
                    g:f72a2d221539cede358f2487b94bc370c6fc44b5
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

This reproduces on both Zen and Intel:

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=298.407.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=800.407.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=470.407.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=957.407.0

I think this is likely extra vectorization enabled by fixing costs:

commit 0650ea627399a0ef23db434d4fce6b52b9faf557
Author: Jan Hubicka <hubi...@ucw.cz>
Date:   Tue Apr 22 23:47:14 2025 +0200

    Fix vectorizer costs of COND_EXPR, MIN_EXPR, MAX_EXPR, ABS_EXPR, ABSU_EXPR

    This patch adds special cases for vectorizer costs of COND_EXPR, MIN_EXPR,
    MAX_EXPR, ABS_EXPR and ABSU_EXPR.  We previously costed ABS_EXPR and
    ABSU_EXPR, but the cost was only correct for the FP variant (where it
    corresponds to andss clearing the sign bit).  Integer abs/absu is open
    coded as a conditional move for SSE2, and SSE3 introduced an instruction.

    MIN_EXPR/MAX_EXPR compile to minss/maxss for FP, and according to Agner
    Fog's tables they cost the same as sse_op on all targets.  Integer
    variants translate to a single instruction since SSE3.

    COND_EXPR translates to an open-coded conditional move for SSE2; SSE4.1
    simplified the sequence and AVX512 introduced mask registers.

    gcc/ChangeLog:

            * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Add
            special cases for COND_EXPR; make MIN_EXPR, MAX_EXPR, ABS_EXPR and
            ABSU_EXPR more realistic.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr89618-2.c: XFAIL.
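
For illustration only, here is a minimal C sketch of loops whose bodies would
typically be represented with the tree codes named in the commit message
(ABS_EXPR, MIN_EXPR, MAX_EXPR, COND_EXPR).  It is not taken from exchange2 or
from pr89618-2.c; the function names are hypothetical, and whether a given
loop is actually vectorized depends on the target flags and on the cost model
under discussion.

```c
#include <stddef.h>

/* Integer absolute value; GCC usually folds this idiom to ABS_EXPR.
   Note: negating INT_MIN is undefined, so callers must avoid it. */
void abs_loop(int *restrict dst, const int *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] < 0 ? -src[i] : src[i];
}

/* Element-wise minimum; a candidate for MIN_EXPR. */
void min_loop(int *restrict dst, const int *restrict a,
              const int *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] < b[i] ? a[i] : b[i];
}

/* Element-wise maximum; a candidate for MAX_EXPR. */
void max_loop(int *restrict dst, const int *restrict a,
              const int *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] > b[i] ? a[i] : b[i];
}

/* General select on a separate condition; a candidate for COND_EXPR. */
void select_loop(int *restrict dst, const int *restrict cond,
                 const int *restrict a, const int *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = cond[i] > 0 ? a[i] : b[i];
}
```

Compiling such a file with something like -O3 -fopt-info-vec before and after
the commit is one way to see whether the vectorization decision changes for
these patterns.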
