On 6/5/25 2:16 PM, Dimitar Dimitrov wrote:
PR119966 showed that combine could generate unfoldable hardware subregs
for pru-unknown-elf.  To fix, strengthen the checks performed by
validate_subreg.

The simplify_subreg_regno performs more validity checks than
the simple info.representable_p.  Most importantly, the
targetm.hard_regno_mode_ok hook is called to ensure the hardware
register is valid in subreg's outer mode.  This fixes the rootcause
for PR119966.

The checks for stack-related registers are bypassed because the i386
backend generates them, in this seemingly valid peephole optimization:

    ;; Attempt to always use XOR for zeroing registers (including FP modes).
    (define_peephole2
      [(set (match_operand 0 "general_reg_operand")
            (match_operand 1 "const0_operand"))]
      "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD
       && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
       && peep2_regno_dead_p (0, FLAGS_REG)"
      [(parallel [(set (match_dup 0) (const_int 0))
                  (clobber (reg:CC FLAGS_REG))])]
      "operands[0] = gen_lowpart (word_mode, operands[0]);")

Testing done:
   * No regressions were detected for C and C++ on x86_64-pc-linux-gnu.
   * "contrib/compare-all-tests i386" showed no difference in code
     generation.
   * No regressions for pru-unknown-elf.
   * Reverted r16-809-gf725d6765373f7 to expose the now latent PR119966.
     Then ensured pru-unknown-elf build is ok.  Only two cases regressed
     where rnreg pass transforms a valid hardware subreg into invalid
     one.  But I think that is not related to combine's PR119966:
       gcc.c-torture/execute/20040709-1.c
       gcc.c-torture/execute/20040709-2.c

This patch was provisionally approved in:
   https://gcc.gnu.org/pipermail/gcc-patches/2025-June/685492.html
I'll wait for 2 days to get pre-commit CI results, then will commit it.

        PR target/119966

gcc/ChangeLog:

        * emit-rtl.cc (validate_subreg): Call simplify_subreg_regno
        instead of checking info.representable_p..
        * rtl.h (simplify_subreg_regno): Add new argument
        allow_stack_regs.
        * rtlanal.cc (simplify_subreg_regno): Do not reject
        stack-related registers if allow_stack_regs is true.
Note that my system is primarily post-commit unless I explicitly throw in a patch for testing purposes. So figure ~24hrs after the patch goes in we'll have the vast majority of the coverage. The native emulated targets will trickle in over the remainder of the week.

Jeff

Reply via email to