https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116564

--- Comment #10 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-14 branch has been updated by Alex Coplan
<acop...@gcc.gnu.org>:

https://gcc.gnu.org/g:23108ab2bdea84beeddf0b8d3b10f4251d3e1686

commit r14-11454-g23108ab2bdea84beeddf0b8d3b10f4251d3e1686
Author: Alex Coplan <alex.cop...@arm.com>
Date:   Mon Mar 10 16:44:15 2025 +0000

    df: Treat partial defs as uses in df_simulate_defs [PR116564]

    The PR shows us spinning in dce.cc:fast_dce at the start of combine.
    This spinning appears to be because of a disagreement between the fast_dce
code
    and the code in df-problems.cc:df_lr_bb_local_compute.  Specifically, they
    disagree on the treatment of partial defs.  For the testcase in the PR, we
have
    the following insn in bb 3:

    (insn 10 8 13 3 (clobber (subreg:V1DF (reg/v:V2x1DF 104 [ __val ]) 8)) -1
         (nil))

    which gives rise to a DF def with DF_REF_FLAGS = 0x8b0, i.e.
    DF_REF_PARTIAL | DF_REF_READ_WRITE | DF_REF_MUST_CLOBBER | DF_REF_SUBREG.

    Eliding the large block comment for readability, the code in
    df_lr_bb_local_compute does the following (for each insn):

          FOR_EACH_INSN_INFO_DEF (def, insn_info)
            {
              unsigned int dregno = DF_REF_REGNO (def);
              bitmap_set_bit (&bb_info->def, dregno);
              if (DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))
                bitmap_set_bit (&bb_info->use, dregno);
              else
                bitmap_clear_bit (&bb_info->use, dregno);
            }

    i.e. it models partial defs as a RMW operation; thus for the def arising
    from i10 above, it records a use of r104; hence it ends up in the
    live-in set for bb 3.

    However, as it stands, the code in dce.cc:fast_dce (and its callee
    dce_process_block) has no such provision for DF_REF_PARTIAL defs.  It
    does not treat these as a RMW and does not compute r104 above as being
    live-in to bb 3.  At the end of dce_process_block we compute the
    following "did something happen" condition used to decide termination of
    the analysis:

      block_changed = !bitmap_equal_p (local_live, DF_LR_IN (bb));
      if (block_changed)
        bitmap_copy (DF_LR_IN (bb), local_live);

      BITMAP_FREE (local_live);
      return block_changed;

    because of the disagreement between df_lr_local_compute and the local
    analysis done by fast_dce, we invariably have r104 in DF_LR_IN, but not
    in local_live.  Hence we always return true here, call
    df_analyze_problem (which re-computes DF_LR_IN according to
    df_lr_bb_local_compute, re-adding r104), and so the analysis never
    terminates.

    This patch therefore adjusts df_simulate_defs (called from
    dce_process_block) to match the behaviour of df_lr_bb_local_compute in
    this respect, namely we make it model partial defs as RMW operations by
    setting the relevant register live.  This fixes the spinning in fast_dce
    for this testcase.

    gcc/ChangeLog:

            PR rtl-optimization/116564
            * df-problems.cc (df_simulate_defs): For partial defs, mark the
            register live (treat it as a RMW operation).

    gcc/testsuite/ChangeLog:

            PR rtl-optimization/116564
            * gcc.target/aarch64/torture/pr116564.c: New test.

    (cherry picked from commit 758e617bcf224dc9d4a7e26dd858d43c1e63b916)

Reply via email to