https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116564
Alex Coplan <acoplan at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |rsandifo at gcc dot gnu.org --- Comment #5 from Alex Coplan <acoplan at gcc dot gnu.org> --- So I did some debugging of this issue. I think the problem is a disagreement between the dce.cc:fast_dce code and the code in df-problems.cc:df_lr_bb_local_compute. Specifically, they disagree on the treatment of partial clobbers. Eliding the large block comment for readability, the code in df_lr_bb_local_compute does: FOR_EACH_INSN_INFO_DEF (def, insn_info) { unsigned int dregno = DF_REF_REGNO (def); bitmap_set_bit (&bb_info->def, dregno); if (DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) bitmap_set_bit (&bb_info->use, dregno); else bitmap_clear_bit (&bb_info->use, dregno); } for each insn. So in the case of: (insn 10 8 13 3 (clobber (subreg:V1DF (reg/v:V2x1DF 104 [ __val ]) 8)) -1 (nil)) we treat this is a use of r104, since the corresponding DF def has the DF_REF_PARTIAL flag set. However, the logic in dce.cc:dce_process_block (called from fast_dce) has no such provision for partial defs. The logic there is instead: FOR_BB_INSNS_REVERSE (bb, insn) if (DEBUG_INSN_P (insn)) [...] else if (INSN_P (insn)) { bool needed = marked_insn_p (insn); /* The insn is needed if there is someone who uses the output. */ if (!needed) FOR_EACH_INSN_DEF (def, insn) if (bitmap_bit_p (local_live, DF_REF_REGNO (def)) || bitmap_bit_p (au, DF_REF_REGNO (def))) { needed = true; mark_insn (insn, true); break; } /* No matter if the instruction is needed or not, we remove any regno in the defs from the live set. */ df_simulate_defs (insn, local_live); /* On the other hand, we do not allow the dead uses to set anything in local_live. */ if (needed) df_simulate_uses (insn, local_live); [...] } where df_simulate_defs is just: void df_simulate_defs (rtx_insn *insn, bitmap live) { df_ref def; FOR_EACH_INSN_DEF (def, insn) { unsigned int dregno = DF_REF_REGNO (def); /* If the def is to only part of the reg, it does not kill the other defs that reach here. */ if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))) bitmap_clear_bit (live, dregno); } } so the main loop in dce_process_block doesn't count i10 above as using r104, it doesn't make r104 live. The "did something happen" condition in dce_process_block is: block_changed = !bitmap_equal_p (local_live, DF_LR_IN (bb)); if (block_changed) bitmap_copy (DF_LR_IN (bb), local_live); BITMAP_FREE (local_live); return block_changed; so we compare what we computed locally against the DF_LR_IN set. Due to the above-mentioned discrepancy in the treatment of partial clobbers, for the above testcase, DF_LR_IN has r104 set, but local_live doesn't, hence we update DF_LR_IN to remove r104, and return true. Eventually we enter the if (global_changed) block in fast_dce, which does: /* We do not need to rescan any instructions. We only need to redo the dataflow equations for the blocks that had a change at the top of the block. Then we need to redo the iteration. */ if (word_level) df_analyze_problem (df_word_lr, all_blocks, postorder, n_blocks); else df_analyze_problem (df_lr, all_blocks, postorder, n_blocks); which ultimately calls df_lr_bb_local_compute again, thus re-adding r104 to the DF_LR_IN sets. This cycle causes us to spin indefinitely (removing r104 in fast_dce and having it re-added by df_analyze_problem -> df_lr_bb_local_compute). I wonder if the code in df_simulate_defs ought not to be updated to match the treatment in df_lr_bb_local_compute (i.e. to set the live bit for partial defs). CCing Richard S as it looks like he last touched the relevant code in df_lr_bb_local_compute.