On Fri, May 16, 2025 at 06:14:30PM +0100, Richard Sandiford wrote:
> Dimitar Dimitrov <dimi...@dinux.eu> writes:
> > After r16-160-ge6f89d78c1a752, late_combine2 started transforming the
> > following RTL for pru-unknown-elf:
> >
> >   (insn 3949 3948 3951 255 (set (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856])
> >           (and:QI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855])
> >               (const_int 3 [0x3])))
> >        (nil))
> >   ...
> >   (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0)
> >           (zero_extend:SI (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856])))
> >        (nil))
> >
> > into:
> >
> >   (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0)
> >           (and:SI (subreg:SI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855]) 0)
> >               (const_int 3 [0x3])))
> >        (nil))
> >
> > That caused libbacktrace build to break for pru-unknown-elf.  Register
> > r0.b1 (regno 1) is not valid for SImode, which validate_subreg failed to
> > reject.
> >
> > Fix by calling HARD_REGNO_MODE_OK to ensure that both inner and outer
> > modes are valid for the hardware subreg.
> >
> > This patch fixes the broken PRU toolchain build.  It leaves only two
> > test case regressions for PRU, caused by rnreg pass renaming a valid
> > paradoxical subreg into an invalid one.
> >   gcc.c-torture/execute/20040709-1.c
> >   gcc.c-torture/execute/20040709-2.c
> >
> >     PR target/119966
> >
> > gcc/ChangeLog:
> >
> >     * emit-rtl.cc (validate_subreg): Validate inner
> >     and outer mode for paradoxical hardware subregs.
> >
> > Co-authored-by: Andrew Pinski <pins...@gmail.com>
> > Signed-off-by: Dimitar Dimitrov <dimi...@dinux.eu>
> > ---
> >  gcc/emit-rtl.cc | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> > index e46b0f9eac4..6c5d9b55508 100644
> > --- a/gcc/emit-rtl.cc
> > +++ b/gcc/emit-rtl.cc
> > @@ -983,6 +983,9 @@ validate_subreg (machine_mode omode, machine_mode imode,
> >        if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> >       && GET_MODE_INNER (imode) == omode)
> >     ;
> > +      else if (!targetm.hard_regno_mode_ok (regno, imode)
> > +          || !targetm.hard_regno_mode_ok (regno, omode))
> > +   return false;
> 
> It isn't meaningful to test regno against omode, since that isn't
> necessarily the register that would be produced by the subreg.

Do you refer to the register renaming pass?  I can't think of another reason
for the regno of a hardware register in a subreg to be changed.

> 
> ISTR that this is a sensitive part of the codebase.  I think there
> are/were targets that create unfoldable subregs for argument passing
> and return.  And I think e500 had unfoldable subregs of FP registers,
> although that port is gone now.

Could you share what is "unfoldable subreg"?  I could not find this phrase
anywhere in the source, except in one comment in the i386 port.

Perhaps a subreg of a hardware register is "unfoldable" when the hardware
register is not valid in the outer mode?  In which case the subreg cannot be
replaced directly with a hardware register?

> 
> So I suppose the question is: when given a hard register, should
> validate_subreg test whether the subreg can be folded to a hard
> register?  Or is it more relaxed than that?  Do we need different
> rules before LRA (which could fix up subregs through reloading)
> and after LRA (where unfoldable subregs stay unfoldable).

My naive answer _was_ that validate_subreg should always perform checks for
hardware registers.  Now I see it was too naive, because I was not aware of
the different ways targets use subregs.  Hence this patch should be dropped.

Meanwhile PR119966, which this patch hoped to address, got fixed instead
with r16-809-gf725d6765373f7.

> 
> If validate_subreg should test whether a subreg of a hard register
> can be folded to a hard register, the fix would be to use
> simplify_subreg_regno instead of the current tests.  But it looks
> like that was deliberately not done.

When validate_subreg was introduced with r0-63800-gbeb72684810c1a,
simplify_subreg_regno simply did not exit.  The simplify_subreg_regno
itself was added later with r0-89444-geef302d277ea42.

> 
> It might still be worth trying to use simplify_subreg_regno and
> seeing what breaks.  Any fallaout would at least let us expand
> the comments to explain the constraints.

I tried simplify_subreg_regno, and some tests regressed for x86_64-pc-linux-gnu:

  check-gcc-c                     trunk               patched     change
  -------------------------------+-------------------+------------+-----
  # of expected passes            216431              216358      -73
  # of unexpected failures        117                 197         +80
  # of unexpected successes       25                  25          0
  # of expected failures          1552                1552        0
  # of unresolved testcases       0                   33          +33
  # of unsupported tests          3888                3888        0

For reference, here are a few of the newly failing tests:
 FAIL: gcc.dg/asan/pr82517.c   -O0  (internal compiler error: in gen_reg_rtx, 
at emit-rtl.cc:1188)
 FAIL: c-c++-common/cpp/embed-28.c  -Wc++-compat  (internal compiler error: in 
gen_reg_rtx, at emit-rtl.cc:1188)
 FAIL: gcc.dg/torture/stackalign/pr16660-3.c   -Os -mforce-drap -fpic (internal 
compiler error: in gen_reg_rtx, at emit-rtl.cc:1188)

The stack pointer register is explicitly rejected by simplify_subreg_regno.
So gen_lowpart fails to return a regno for 'sp' in config/i386/i386.md:28561:

  ;; Attempt to always use XOR for zeroing registers (including FP modes).
  (define_peephole2
    [(set (match_operand 0 "general_reg_operand")
          (match_operand 1 "const0_operand"))]
    "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD
     && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
     && peep2_regno_dead_p (0, FLAGS_REG)"
    [(parallel [(set (match_dup 0) (const_int 0))
                (clobber (reg:CC FLAGS_REG))])]
    "operands[0] = gen_lowpart (word_mode, operands[0]);")

peephole2 tries to generate the following, but gen_lowpart cannot create the
subreg:

  (insn 29 8 12 2 (parallel [
              (set (subreg:DI (reg/v:SI 7 sp [ a ]) 0)
                  (const_int 0 [0]))
              (clobber (reg:CC 17 flags))
          ])

ICE backtrace:
  .../gcc/gcc/testsuite/gcc.dg/asan/pr82517.c:34:1: internal compiler error: in 
gen_reg_rtx, at emit-rtl.cc:1188
  0x22d3d1f internal_error(char const*, ...)
          
/home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic-global-context.cc:517
  0x669157 fancy_abort(char const*, int, char const*)
          /home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic.cc:1815
  0x46df3d gen_reg_rtx(machine_mode)
          /home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:1188
  0x946c80 copy_to_reg(rtx_def*)
          /home/dinux/projects/pru/local-workspace/gcc/gcc/explow.cc:622
  0xd5c64f gen_lowpart_general(machine_mode, rtx_def*)
          /home/dinux/projects/pru/local-workspace/gcc/gcc/rtlhooks.cc:56
  0x1998ca9 gen_split_318(rtx_insn*, rtx_def**)
          
/home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:12877
  0x1d8a2b9 split_insns(rtx_def*, rtx_insn*)
          
/home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:20885
  0x930f8b try_split(rtx_def*, rtx_insn*, int)
          /home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:3952
  0xd18ab5 split_insn
          /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3479
  0xd1e877 split_all_insns()
          /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3583
  0xd1e928 execute
          /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:4507

Thanks,
Dimitar

> 
> Thanks,
> Richard

Reply via email to