https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116564

Alex Coplan <acoplan at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #5 from Alex Coplan <acoplan at gcc dot gnu.org> ---
So I did some debugging of this issue.  I think the problem is a disagreement
between the dce.cc:fast_dce code and the code in
df-problems.cc:df_lr_bb_local_compute.  Specifically, they disagree on the
treatment of partial clobbers.

Eliding the large block comment for readability, the code in
df_lr_bb_local_compute does:

      FOR_EACH_INSN_INFO_DEF (def, insn_info)
        {
          unsigned int dregno = DF_REF_REGNO (def);
          bitmap_set_bit (&bb_info->def, dregno);
          if (DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))
            bitmap_set_bit (&bb_info->use, dregno);
          else
            bitmap_clear_bit (&bb_info->use, dregno);
        }

for each insn.  So in the case of:

(insn 10 8 13 3 (clobber (subreg:V1DF (reg/v:V2x1DF 104 [ __val ]) 8)) -1
     (nil))

we treat this is a use of r104, since the corresponding DF def has the
DF_REF_PARTIAL flag set.  However, the logic in dce.cc:dce_process_block
(called from fast_dce) has no such provision for partial defs. The logic there
is instead:

  FOR_BB_INSNS_REVERSE (bb, insn)
    if (DEBUG_INSN_P (insn))
      [...]
    else if (INSN_P (insn))
      {
        bool needed = marked_insn_p (insn);

        /* The insn is needed if there is someone who uses the output.  */
        if (!needed)
          FOR_EACH_INSN_DEF (def, insn)
            if (bitmap_bit_p (local_live, DF_REF_REGNO (def))
                || bitmap_bit_p (au, DF_REF_REGNO (def)))
              {
                needed = true;
                mark_insn (insn, true);
                break;
              }

        /* No matter if the instruction is needed or not, we remove
           any regno in the defs from the live set.  */
        df_simulate_defs (insn, local_live);

        /* On the other hand, we do not allow the dead uses to set
           anything in local_live.  */
        if (needed)
          df_simulate_uses (insn, local_live);

        [...]
      }

where df_simulate_defs is just:

void
df_simulate_defs (rtx_insn *insn, bitmap live)
{
  df_ref def;

  FOR_EACH_INSN_DEF (def, insn)
    {
      unsigned int dregno = DF_REF_REGNO (def);

      /* If the def is to only part of the reg, it does
         not kill the other defs that reach here.  */
      if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
        bitmap_clear_bit (live, dregno);
    }
}

so the main loop in dce_process_block doesn't count i10 above as using r104, it
doesn't make r104 live.

The "did something happen" condition in dce_process_block is:

  block_changed = !bitmap_equal_p (local_live, DF_LR_IN (bb));
  if (block_changed)
    bitmap_copy (DF_LR_IN (bb), local_live);

  BITMAP_FREE (local_live);
  return block_changed;

so we compare what we computed locally against the DF_LR_IN set.  Due to the
above-mentioned discrepancy in the treatment of partial clobbers, for the above
testcase, DF_LR_IN has r104 set, but local_live doesn't, hence we update
DF_LR_IN to remove r104, and return true.

Eventually we enter the if (global_changed) block in fast_dce, which does:

          /* We do not need to rescan any instructions.  We only need
             to redo the dataflow equations for the blocks that had a
             change at the top of the block.  Then we need to redo the
             iteration.  */
          if (word_level)
            df_analyze_problem (df_word_lr, all_blocks, postorder, n_blocks);
          else
            df_analyze_problem (df_lr, all_blocks, postorder, n_blocks);

which ultimately calls df_lr_bb_local_compute again, thus re-adding r104 to the
DF_LR_IN sets.  This cycle causes us to spin indefinitely (removing r104 in
fast_dce and having it re-added by df_analyze_problem ->
df_lr_bb_local_compute).

I wonder if the code in df_simulate_defs ought not to be updated to match the
treatment in df_lr_bb_local_compute (i.e. to set the live bit for partial
defs).  CCing Richard S as it looks like he last touched the relevant code in
df_lr_bb_local_compute.

Reply via email to