https://gcc.gnu.org/g:7d3aec2a832ef47be547d9426187562e4548bae6
commit r15-7916-g7d3aec2a832ef47be547d9426187562e4548bae6
Author: Jeff Law <j...@ventanamicro.com>
Date:   Sun Mar 9 14:25:37 2025 -0600

    [rtl-optimization/117467] Mark FP destinations as dead

    The next step in improving ext-dce is to clean up a minor wart in the
    set/clobber handling code.

    In that code the safe thing to do is to not process a destination at
    all.  That will leave bits set in the live bitmaps for objects that may
    no longer be live.  Of course with extraneous bits set we use more
    memory and do more work managing the bitmaps, but it's safe from a code
    correctness standpoint.

    One case that is slipping through that we need to fix is scalar fp
    destinations.  Essentially the code never tried to handle those and as
    a result would leave those entities live and bubble them up through the
    CFG.

    In the testcase at hand this takes us from ~10k live objects at entry
    to ~4k live objects at entry.  Time spent in ext-dce goes from 2.14s to
    .64s.

    Bootstrapped and regression tested on x86_64.

            PR rtl-optimization/117467
    gcc/
            * ext-dce.cc (ext_dce_process_sets): Handle FP destinations better.

Diff:
---
 gcc/ext-dce.cc | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index c53dd5b46161..35ddda00cdb6 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -206,8 +206,8 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
 
       /* We don't support vector destinations or destinations
          wider than DImode.  */
-      scalar_int_mode outer_mode;
-      if (!is_a <scalar_int_mode> (GET_MODE (x), &outer_mode)
+      scalar_mode outer_mode;
+      if (!is_a <scalar_mode> (GET_MODE (x), &outer_mode)
           || GET_MODE_BITSIZE (outer_mode) > HOST_BITS_PER_WIDE_INT)
         {
           /* Skip the subrtxs of this destination.  There is
@@ -239,7 +239,7 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
               /* The inner mode might be larger, just punt for
                  that case.  Remember, we can not just continue to process
                  the inner RTXs due to the STRICT_LOW_PART.  */
-              if (!is_a <scalar_int_mode> (GET_MODE (SUBREG_REG (x)), &outer_mode)
+              if (!is_a <scalar_mode> (GET_MODE (SUBREG_REG (x)), &outer_mode)
                   || GET_MODE_BITSIZE (outer_mode) > HOST_BITS_PER_WIDE_INT)
                 {
                   /* Skip the subrtxs of the STRICT_LOW_PART.  We can't
@@ -293,7 +293,7 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
                  subreg and restart within the SET processing rather than
                  the top of the loop which just complicates the flow even
                  more.  */
-          if (!is_a <scalar_int_mode> (GET_MODE (SUBREG_REG (x)), &outer_mode)
+          if (!is_a <scalar_mode> (GET_MODE (SUBREG_REG (x)), &outer_mode)
               || GET_MODE_BITSIZE (outer_mode) > HOST_BITS_PER_WIDE_INT)
             {
               skipped_dest = true;
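
For readers who do not have ext-dce.cc in front of them, the effect of widening the gate can be sketched with a toy model. This is not GCC code: `ModeClass`, `Dest`, `handled_before`, `handled_after`, and `live_after_sets` are hypothetical names invented for illustration, liveness is tracked as one entry per register rather than with the pass's real bitmaps, and `HOST_BITS_PER_WIDE_INT` is assumed to be 64. The sketch only shows why a skipped destination stays live while a handled one is removed from the live set.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-ins for machine-mode classes; not GCC's real API.
enum class ModeClass { ScalarInt, ScalarFloat, Vector };

struct Dest {
  int regno;       // register written by the set
  ModeClass cls;   // class of the destination's mode
  unsigned bits;   // width of the destination's mode
};

// Assumed value of HOST_BITS_PER_WIDE_INT for this sketch.
constexpr unsigned kHostBitsPerWideInt = 64;

// Gate modelled on the old check: only scalar integer destinations pass.
bool handled_before(const Dest &d) {
  return d.cls == ModeClass::ScalarInt && d.bits <= kHostBitsPerWideInt;
}

// Gate modelled on the new check: scalar FP destinations pass as well.
bool handled_after(const Dest &d) {
  bool scalar = d.cls == ModeClass::ScalarInt || d.cls == ModeClass::ScalarFloat;
  return scalar && d.bits <= kHostBitsPerWideInt;
}

// Apply a sequence of sets to a live set.  A handled destination is killed;
// a skipped destination is left alone, which is safe but keeps it live.
std::set<int> live_after_sets(const std::vector<Dest> &dests,
                              std::set<int> live,
                              bool (*handled)(const Dest &)) {
  for (const Dest &d : dests)
    if (handled(d))
      live.erase(d.regno);
  return live;
}

// Three sets: a 32-bit integer, a 64-bit float, and a 128-bit vector.
const std::vector<Dest> kExampleDests = {
    {1, ModeClass::ScalarInt, 32},
    {2, ModeClass::ScalarFloat, 64},
    {3, ModeClass::Vector, 128},
};
```

Starting from a live set of {1, 2, 3}, calling `live_after_sets (kExampleDests, {1, 2, 3}, handled_before)` leaves registers 2 and 3 live, while the same call with `handled_after` leaves only register 3. In this toy model the vector destination is still skipped under both gates, matching the comment in the diff that vector destinations are not supported.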