On Sun, Jun 08, 2025 at 09:09:44AM -0600, Jeff Law wrote:
> 
> 
> On 6/5/25 2:16 PM, Dimitar Dimitrov wrote:
> > PR119966 showed that combine could generate unfoldable hardware subregs
> > for pru-unknown-elf.  To fix, strengthen the checks performed by
> > validate_subreg.
> > 
> > The simplify_subreg_regno performs more validity checks than
> > the simple info.representable_p.  Most importantly, the
> > targetm.hard_regno_mode_ok hook is called to ensure the hardware
> > register is valid in subreg's outer mode.  This fixes the rootcause
> > for PR119966.
> > 
> > The checks for stack-related registers are bypassed because the i386
> > backend generates them, in this seemingly valid peephole optimization:
> > 
> >     ;; Attempt to always use XOR for zeroing registers (including FP modes).
> >     (define_peephole2
> >       [(set (match_operand 0 "general_reg_operand")
> >             (match_operand 1 "const0_operand"))]
> >       "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD
> >        && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
> >        && peep2_regno_dead_p (0, FLAGS_REG)"
> >       [(parallel [(set (match_dup 0) (const_int 0))
> >                   (clobber (reg:CC FLAGS_REG))])]
> >       "operands[0] = gen_lowpart (word_mode, operands[0]);")
> > 
> > Testing done:
> >    * No regressions were detected for C and C++ on x86_64-pc-linux-gnu.
> >    * "contrib/compare-all-tests i386" showed no difference in code
> >      generation.
> >    * No regressions for pru-unknown-elf.
> >    * Reverted r16-809-gf725d6765373f7 to expose the now latent PR119966.
> >      Then ensured pru-unknown-elf build is ok.  Only two cases regressed
> >      where rnreg pass transforms a valid hardware subreg into invalid
> >      one.  But I think that is not related to combine's PR119966:
> >        gcc.c-torture/execute/20040709-1.c
> >        gcc.c-torture/execute/20040709-2.c
> > 
> > This patch was provisionally approved in:
> >    https://gcc.gnu.org/pipermail/gcc-patches/2025-June/685492.html
> > I'll wait for 2 days to get pre-commit CI results, then will commit it.
> > 
> >     PR target/119966
> > 
> > gcc/ChangeLog:
> > 
> >     * emit-rtl.cc (validate_subreg): Call simplify_subreg_regno
> >     instead of checking info.representable_p..
> >     * rtl.h (simplify_subreg_regno): Add new argument
> >     allow_stack_regs.
> >     * rtlanal.cc (simplify_subreg_regno): Do not reject
> >     stack-related registers if allow_stack_regs is true.
> Note that my system is primarily post-commit unless I
> explicitly throw in a patch for testing purposes.  So
> figure ~24hrs after the patch goes in we'll have the
> vast majority of the coverage.  The native emulated
> targets will trickle in over the remainder of the week.
> 

I did more testing.  riscv32-unknown-elf and riscv64-unknown-linux-gnu
are ok.  But this patch breaks AVR due to the following workaround for
old reload in avr_hard_regno_mode_ok:

  if (GET_MODE_SIZE (mode) >= 4
      && regno >= REG_X
      // This problem only concerned the old reload.
      && ! avropt_lra_p)
    return false;

The following insn causes a split for "zero_extendsidi2" to ICE because
operands are replaced with output from simplify_gen_subreg, which now
fails:

  (insn 17 16 18 2 (set (reg/v:DI 24 r24 [orig:44 lowD.4526 ] [44])
          (zero_extend:DI (reg:SI 22 r22 [orig:66 aD.4519 ] [66]))) 
"/mnt/nvme/dinux/local-workspace/gcc/libgcc/fixed-bit.c":790:7 827 
{zero_extendsidi2}
       (nil))

That should be fixed once AVR is switched to LRA (PR113934).

Richard, Jeff, it does not seem appropriate to merge this patch now,
given that it breaks avr and or1k.  Let me know if such experiment is
desired despite the known breakages.

Regards,
Dimitar

> Jeff
> 

Reply via email to