https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932

--- Comment #19 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <tnfch...@gcc.gnu.org>:

https://gcc.gnu.org/g:9403b035befe3537c343f7430e321468c0f2c28b

commit r15-6105-g9403b035befe3537c343f7430e321468c0f2c28b
Author: Tamar Christina <tamar.christ...@arm.com>
Date:   Wed Dec 11 11:47:49 2024 +0000

    middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

    IVOPTS normally uses affine trees to perform comparisons between different IVs,
    but these seem to have been missing in two key spots, where normal tree
    equivalences were used instead.

    In some cases where we have a two's complement equivalence but not a strict
    signedness equivalence, we end up generating both a signed and an unsigned IV
    for the same candidate.

    This patch implements a new OEP flag called OEP_ASSUME_WRAPV.  This flag
    checks whether the operands would produce the same bit values after the
    computations even if the final sign is different.
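
    To make the "same bit values" idea concrete, here is a minimal C sketch.
    It is not code from the patch: the function names are hypothetical and
    int64_t/uint64_t stand in for the long/unsigned long types in the dump.
    It only uses strides small enough that the signed multiply cannot
    overflow, since signed overflow is undefined in C; the flag additionally
    assumes wrapping, which portable C cannot show directly.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the shape of "stride * 4": multiply in the signed type,
   then convert the result to unsigned.  Assumes stride * 4 does not
   overflow int64_t. */
static uint64_t mul_signed_then_convert(int64_t stride)
{
    return (uint64_t)(stride * 4);
}

/* Mirrors the shape of "(unsigned long) stride * 4": convert first,
   then multiply in the unsigned type (which wraps modulo 2^64). */
static uint64_t convert_then_mul_unsigned(int64_t stride)
{
    return (uint64_t)stride * 4;
}

/* Returns 1 when both forms yield the same bit pattern. */
static int same_bits(int64_t stride)
{
    return mul_signed_then_convert(stride) == convert_then_mul_unsigned(stride);
}

/* Small self-check covering zero, positive and negative strides. */
static void check_bit_equality(void)
{
    assert(same_bits(0));
    assert(same_bits(7));
    assert(same_bits(-3));
}
```

    Calling check_bit_equality() from a test driver exercises all three
    cases; the negative stride is the interesting one, because the two
    forms differ in signedness yet agree bit for bit.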

    This happens quite a lot with Fortran but can also happen in C because this
    same code is unable to figure out when one expression is a multiple of another.

    As an example in the attached testcase we get:

    Initial set of candidates:
      cost: 24 (complexity 3)
      reg_cost: 9
      cand_cost: 15
      cand_group_cost: 0 (complexity 3)
      candidates: 1, 6, 8
       group:0 --> iv_cand:6, cost=(0,1)
       group:1 --> iv_cand:1, cost=(0,0)
       group:2 --> iv_cand:8, cost=(0,1)
       group:3 --> iv_cand:8, cost=(0,1)
      invariant variables: 6
      invariant expressions: 1, 2

    <Invariant Expressions>:
    inv_expr 1:     stride.3_27 * 4
    inv_expr 2:     (unsigned long) stride.3_27 * 4

    These end up being used in the same group:

    Group 1:
    cand  cost    compl.  inv.expr.       inv.vars
    1     0       0       NIL;    6
    2     0       0       NIL;    6
    3     0       0       NIL;    6

    which ends up with IV opts picking the signed and unsigned IVs:

    Improved to:
      cost: 24 (complexity 3)
      reg_cost: 9
      cand_cost: 15
      cand_group_cost: 0 (complexity 3)
      candidates: 1, 6, 8
       group:0 --> iv_cand:6, cost=(0,1)
       group:1 --> iv_cand:1, cost=(0,0)
       group:2 --> iv_cand:8, cost=(0,1)
       group:3 --> iv_cand:8, cost=(0,1)
      invariant variables: 6
      invariant expressions: 1, 2

    and so generates the same IV as both signed and unsigned:

    ;;   basic block 21, loop depth 3, count 214748368 (estimated locally, freq 58.2545), maybe hot
    ;;    prev block 28, next block 31, flags: (NEW, REACHABLE, VISITED)
    ;;    pred:       28 [always]  count:23622320 (estimated locally, freq 6.4080) (FALLTHRU,EXECUTABLE)
    ;;                25 [always]  count:191126046 (estimated locally, freq 51.8465) (FALLTHRU,DFS_BACK,EXECUTABLE)
      # .MEM_66 = PHI <.MEM_34(28), .MEM_22(25)>
      # ivtmp.22_41 = PHI <0(28), ivtmp.22_82(25)>
      # ivtmp.26_51 = PHI <ivtmp.26_55(28), ivtmp.26_72(25)>
      # ivtmp.28_90 = PHI <ivtmp.28_99(28), ivtmp.28_98(25)>

    ...

    ;;   basic block 24, loop depth 3, count 214748366 (estimated locally, freq 58.2545), maybe hot
    ;;    prev block 22, next block 25, flags: (NEW, REACHABLE, VISITED)
    ;;    pred:       22 [always]  count:95443719 (estimated locally, freq 25.8909) (FALLTHRU)
    ;;                21 [33.3% (guessed)]  count:71582790 (estimated locally, freq 19.4182) (TRUE_VALUE,EXECUTABLE)
    ;;                31 [33.3% (guessed)]  count:47721860 (estimated locally, freq 12.9455) (TRUE_VALUE,EXECUTABLE)
    # .MEM_22 = PHI <.MEM_44(22), .MEM_31(21), .MEM_79(31)>
    ivtmp.22_82 = ivtmp.22_41 + 1;
    ivtmp.26_72 = ivtmp.26_51 + _80;
    ivtmp.28_98 = ivtmp.28_90 + _39;

    These two IVs are always used as unsigned, so IVOPTS generates:

      _73 = stride.3_27 * 4;
      _80 = (unsigned long) _73;
      _54 = (unsigned long) stride.3_27;
      _39 = _54 * 4;
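
    The redundancy can be illustrated with a small C sketch.  This is a
    hypothetical model rather than anything from the patch: it computes two
    step values shaped like _80 and _39 and advances two counters with
    them.  Starting both counters from the same base is an assumption made
    for the illustration (the dump does not say the bases match), and the
    stride is assumed small enough that stride * 4 does not overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Advances two induction variables for n iterations, one per step
   computation, and returns 1 if they held equal values throughout.
   Assumes stride * 4 does not overflow int64_t. */
static int ivs_stay_in_lockstep(int64_t stride, size_t n)
{
    uint64_t step_a = (uint64_t)(stride * 4);  /* shaped like _80 */
    uint64_t step_b = (uint64_t)stride * 4;    /* shaped like _39 */
    uint64_t iv_a = 0;                         /* assumed common base */
    uint64_t iv_b = 0;

    for (size_t i = 0; i < n; i++) {
        if (iv_a != iv_b)
            return 0;
        iv_a += step_a;  /* unsigned addition wraps modulo 2^64 */
        iv_b += step_b;
    }
    return iv_a == iv_b;
}

/* Small self-check with positive, negative and zero strides. */
static void check_lockstep(void)
{
    assert(ivs_stay_in_lockstep(5, 10));
    assert(ivs_stay_in_lockstep(-2, 100));
    assert(ivs_stay_in_lockstep(0, 3));
}
```

    Because the two steps are always equal, keeping both costs an extra
    register and an extra add per iteration without computing anything new.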

    Which means that in e.g. exchange2 we generate a lot of duplicate code.

    This is because candidates 6 and 8 are equivalent under two's complement but
    have different signs.

    This patch changes it so that if two IVs are affine equivalent, IVOPTS
    just picks one over the other.  IVOPTS already has code for this, so the patch
    just uses affine trees instead of trees for the check.
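
    As a rough intuition for why an affine comparison succeeds where a
    structural tree comparison fails, consider the toy model below.  It is
    a hypothetical sketch and not GCC's affine tree implementation: every
    type and function name is invented for illustration.  An expression
    over one variable x is reduced to offset + coef * x with all arithmetic
    done modulo 2^64, so the signedness of the original expression no
    longer appears in the reduced form.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy form: value = offset + coef * x (mod 2^64). */
struct toy_affine {
    uint64_t offset;
    uint64_t coef;
};

/* The bare variable x. */
static struct toy_affine toy_var(void)
{
    return (struct toy_affine){ 0, 1 };
}

/* Scale by a constant written in a signed type. */
static struct toy_affine toy_scale_signed(struct toy_affine a, int64_t k)
{
    uint64_t uk = (uint64_t)k;  /* well-defined conversion modulo 2^64 */
    return (struct toy_affine){ a.offset * uk, a.coef * uk };
}

/* Scale by a constant written in an unsigned type. */
static struct toy_affine toy_scale_unsigned(struct toy_affine a, uint64_t k)
{
    return (struct toy_affine){ a.offset * k, a.coef * k };
}

/* Two forms are equal when their reduced coefficients match. */
static int toy_equal(struct toy_affine a, struct toy_affine b)
{
    return a.offset == b.offset && a.coef == b.coef;
}

/* Small self-check: signed and unsigned "x * 4" reduce to the same form. */
static void check_toy_affine(void)
{
    assert(toy_equal(toy_scale_signed(toy_var(), 4),
                     toy_scale_unsigned(toy_var(), 4)));
}
```

    In this model the signed and unsigned versions of "x * 4" compare
    equal, whereas a comparison that also inspects the expression's type
    would keep them apart.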

    With it we get:

    <Invariant Expressions>:
    inv_expr 1:     stride.3_27 * 4

    <Group-candidate Costs>:
    Group 0:
      cand  cost    compl.  inv.expr.       inv.vars
      5     0       2       NIL;    NIL;
      6     0       3       NIL;    NIL;

    Group 1:
      cand  cost    compl.  inv.expr.       inv.vars
      1     0       0       NIL;    6
      2     0       0       NIL;    6
      3     0       0       NIL;    6
      4     0       0       NIL;    6

    Initial set of candidates:
      cost: 16 (complexity 3)
      reg_cost: 6
      cand_cost: 10
      cand_group_cost: 0 (complexity 3)
      candidates: 1, 6
       group:0 --> iv_cand:6, cost=(0,3)
       group:1 --> iv_cand:1, cost=(0,0)
      invariant variables: 6
      invariant expressions: 1

    gcc/ChangeLog:

            PR tree-optimization/114932
            * fold-const.cc (operand_compare::operand_equal_p): Use it.
            (operand_compare::verify_hash_value): Likewise.
            (operand_compare::hash_operand): Likewise.
            (test_operand_equality::test): New.
            (fold_const_cc_tests): Use it.
            * tree-core.h (enum operand_equal_flag): Add OEP_ASSUME_WRAPV.
            * tree-ssa-loop-ivopts.cc (record_group_use): Check for structural eq.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/114932
            * gfortran.dg/addressing-modes_2.f90: New test.
