On Fri, May 16, 2025 at 06:14:30PM +0100, Richard Sandiford wrote:
> Dimitar Dimitrov <[email protected]> writes:
> > After r16-160-ge6f89d78c1a752, late_combine2 started transforming the
> > following RTL for pru-unknown-elf:
> >
> > (insn 3949 3948 3951 255 (set (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856])
> > (and:QI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855])
> > (const_int 3 [0x3])))
> > (nil))
> > ...
> > (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0)
> > (zero_extend:SI (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856])))
> > (nil))
> >
> > into:
> >
> > (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0)
> > (and:SI (subreg:SI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855]) 0)
> > (const_int 3 [0x3])))
> > (nil))
> >
> > That caused libbacktrace build to break for pru-unknown-elf. Register
> > r0.b1 (regno 1) is not valid for SImode, which validate_subreg failed to
> > reject.
> >
> > Fix by calling HARD_REGNO_MODE_OK to ensure that both inner and outer
> > modes are valid for the hardware subreg.
> >
> > This patch fixes the broken PRU toolchain build. It leaves only two
> > test case regressions for PRU, caused by rnreg pass renaming a valid
> > paradoxical subreg into an invalid one.
> > gcc.c-torture/execute/20040709-1.c
> > gcc.c-torture/execute/20040709-2.c
> >
> > PR target/119966
> >
> > gcc/ChangeLog:
> >
> > * emit-rtl.cc (validate_subreg): Validate inner
> > and outer mode for paradoxical hardware subregs.
> >
> > Co-authored-by: Andrew Pinski <[email protected]>
> > Signed-off-by: Dimitar Dimitrov <[email protected]>
> > ---
> > gcc/emit-rtl.cc | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> > index e46b0f9eac4..6c5d9b55508 100644
> > --- a/gcc/emit-rtl.cc
> > +++ b/gcc/emit-rtl.cc
> > @@ -983,6 +983,9 @@ validate_subreg (machine_mode omode, machine_mode imode,
> > if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> > && GET_MODE_INNER (imode) == omode)
> > ;
> > + else if (!targetm.hard_regno_mode_ok (regno, imode)
> > + || !targetm.hard_regno_mode_ok (regno, omode))
> > + return false;
>
> It isn't meaningful to test regno against omode, since that isn't
> necessarily the register that would be produced by the subreg.
Do you refer to the register renaming pass? I can't think of another reason
for the regno of a hardware register in a subreg to be changed.
>
> ISTR that this is a sensitive part of the codebase. I think there
> are/were targets that create unfoldable subregs for argument passing
> and return. And I think e500 had unfoldable subregs of FP registers,
> although that port is gone now.
Could you share what is "unfoldable subreg"? I could not find this phrase
anywhere in the source, except in one comment in the i386 port.
Perhaps a subreg of a hardware register is "unfoldable" when the hardware
register is not valid in the outer mode? In which case the subreg cannot be
replaced directly with a hardware register?
>
> So I suppose the question is: when given a hard register, should
> validate_subreg test whether the subreg can be folded to a hard
> register? Or is it more relaxed than that? Do we need different
> rules before LRA (which could fix up subregs through reloading)
> and after LRA (where unfoldable subregs stay unfoldable).
My naive answer _was_ that validate_subreg should always perform checks for
hardware registers. Now I see it was too naive, because I was not aware of
the different ways targets use subregs. Hence this patch should be dropped.
Meanwhile PR119966, which this patch hoped to address, got fixed instead
with r16-809-gf725d6765373f7.
>
> If validate_subreg should test whether a subreg of a hard register
> can be folded to a hard register, the fix would be to use
> simplify_subreg_regno instead of the current tests. But it looks
> like that was deliberately not done.
When validate_subreg was introduced with r0-63800-gbeb72684810c1a,
simplify_subreg_regno simply did not exit. The simplify_subreg_regno
itself was added later with r0-89444-geef302d277ea42.
>
> It might still be worth trying to use simplify_subreg_regno and
> seeing what breaks. Any fallaout would at least let us expand
> the comments to explain the constraints.
I tried simplify_subreg_regno, and some tests regressed for x86_64-pc-linux-gnu:
check-gcc-c trunk patched change
-------------------------------+-------------------+------------+-----
# of expected passes 216431 216358 -73
# of unexpected failures 117 197 +80
# of unexpected successes 25 25 0
# of expected failures 1552 1552 0
# of unresolved testcases 0 33 +33
# of unsupported tests 3888 3888 0
For reference, here are a few of the newly failing tests:
FAIL: gcc.dg/asan/pr82517.c -O0 (internal compiler error: in gen_reg_rtx,
at emit-rtl.cc:1188)
FAIL: c-c++-common/cpp/embed-28.c -Wc++-compat (internal compiler error: in
gen_reg_rtx, at emit-rtl.cc:1188)
FAIL: gcc.dg/torture/stackalign/pr16660-3.c -Os -mforce-drap -fpic (internal
compiler error: in gen_reg_rtx, at emit-rtl.cc:1188)
The stack pointer register is explicitly rejected by simplify_subreg_regno.
So gen_lowpart fails to return a regno for 'sp' in config/i386/i386.md:28561:
;; Attempt to always use XOR for zeroing registers (including FP modes).
(define_peephole2
[(set (match_operand 0 "general_reg_operand")
(match_operand 1 "const0_operand"))]
"GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD
&& (! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
&& peep2_regno_dead_p (0, FLAGS_REG)"
[(parallel [(set (match_dup 0) (const_int 0))
(clobber (reg:CC FLAGS_REG))])]
"operands[0] = gen_lowpart (word_mode, operands[0]);")
peephole2 tries to generate the following, but gen_lowpart cannot create the
subreg:
(insn 29 8 12 2 (parallel [
(set (subreg:DI (reg/v:SI 7 sp [ a ]) 0)
(const_int 0 [0]))
(clobber (reg:CC 17 flags))
])
ICE backtrace:
.../gcc/gcc/testsuite/gcc.dg/asan/pr82517.c:34:1: internal compiler error: in
gen_reg_rtx, at emit-rtl.cc:1188
0x22d3d1f internal_error(char const*, ...)
/home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic-global-context.cc:517
0x669157 fancy_abort(char const*, int, char const*)
/home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic.cc:1815
0x46df3d gen_reg_rtx(machine_mode)
/home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:1188
0x946c80 copy_to_reg(rtx_def*)
/home/dinux/projects/pru/local-workspace/gcc/gcc/explow.cc:622
0xd5c64f gen_lowpart_general(machine_mode, rtx_def*)
/home/dinux/projects/pru/local-workspace/gcc/gcc/rtlhooks.cc:56
0x1998ca9 gen_split_318(rtx_insn*, rtx_def**)
/home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:12877
0x1d8a2b9 split_insns(rtx_def*, rtx_insn*)
/home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:20885
0x930f8b try_split(rtx_def*, rtx_insn*, int)
/home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:3952
0xd18ab5 split_insn
/home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3479
0xd1e877 split_all_insns()
/home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3583
0xd1e928 execute
/home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:4507
Thanks,
Dimitar
>
> Thanks,
> Richard