On Fri, May 16, 2025 at 06:14:30PM +0100, Richard Sandiford wrote: > Dimitar Dimitrov <dimi...@dinux.eu> writes: > > After r16-160-ge6f89d78c1a752, late_combine2 started transforming the > > following RTL for pru-unknown-elf: > > > > (insn 3949 3948 3951 255 (set (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856]) > > (and:QI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855]) > > (const_int 3 [0x3]))) > > (nil)) > > ... > > (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0) > > (zero_extend:SI (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856]))) > > (nil)) > > > > into: > > > > (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0) > > (and:SI (subreg:SI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855]) 0) > > (const_int 3 [0x3]))) > > (nil)) > > > > That caused libbacktrace build to break for pru-unknown-elf. Register > > r0.b1 (regno 1) is not valid for SImode, which validate_subreg failed to > > reject. > > > > Fix by calling HARD_REGNO_MODE_OK to ensure that both inner and outer > > modes are valid for the hardware subreg. > > > > This patch fixes the broken PRU toolchain build. It leaves only two > > test case regressions for PRU, caused by rnreg pass renaming a valid > > paradoxical subreg into an invalid one. > > gcc.c-torture/execute/20040709-1.c > > gcc.c-torture/execute/20040709-2.c > > > > PR target/119966 > > > > gcc/ChangeLog: > > > > * emit-rtl.cc (validate_subreg): Validate inner > > and outer mode for paradoxical hardware subregs. > > > > Co-authored-by: Andrew Pinski <pins...@gmail.com> > > Signed-off-by: Dimitar Dimitrov <dimi...@dinux.eu> > > --- > > gcc/emit-rtl.cc | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc > > index e46b0f9eac4..6c5d9b55508 100644 > > --- a/gcc/emit-rtl.cc > > +++ b/gcc/emit-rtl.cc > > @@ -983,6 +983,9 @@ validate_subreg (machine_mode omode, machine_mode imode, > > if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode)) > > && GET_MODE_INNER (imode) == omode) > > ; > > + else if (!targetm.hard_regno_mode_ok (regno, imode) > > + || !targetm.hard_regno_mode_ok (regno, omode)) > > + return false; > > It isn't meaningful to test regno against omode, since that isn't > necessarily the register that would be produced by the subreg.
Do you refer to the register renaming pass? I can't think of another reason for the regno of a hardware register in a subreg to be changed. > > ISTR that this is a sensitive part of the codebase. I think there > are/were targets that create unfoldable subregs for argument passing > and return. And I think e500 had unfoldable subregs of FP registers, > although that port is gone now. Could you share what is "unfoldable subreg"? I could not find this phrase anywhere in the source, except in one comment in the i386 port. Perhaps a subreg of a hardware register is "unfoldable" when the hardware register is not valid in the outer mode? In which case the subreg cannot be replaced directly with a hardware register? > > So I suppose the question is: when given a hard register, should > validate_subreg test whether the subreg can be folded to a hard > register? Or is it more relaxed than that? Do we need different > rules before LRA (which could fix up subregs through reloading) > and after LRA (where unfoldable subregs stay unfoldable). My naive answer _was_ that validate_subreg should always perform checks for hardware registers. Now I see it was too naive, because I was not aware of the different ways targets use subregs. Hence this patch should be dropped. Meanwhile PR119966, which this patch hoped to address, got fixed instead with r16-809-gf725d6765373f7. > > If validate_subreg should test whether a subreg of a hard register > can be folded to a hard register, the fix would be to use > simplify_subreg_regno instead of the current tests. But it looks > like that was deliberately not done. When validate_subreg was introduced with r0-63800-gbeb72684810c1a, simplify_subreg_regno simply did not exit. The simplify_subreg_regno itself was added later with r0-89444-geef302d277ea42. > > It might still be worth trying to use simplify_subreg_regno and > seeing what breaks. Any fallaout would at least let us expand > the comments to explain the constraints. I tried simplify_subreg_regno, and some tests regressed for x86_64-pc-linux-gnu: check-gcc-c trunk patched change -------------------------------+-------------------+------------+----- # of expected passes 216431 216358 -73 # of unexpected failures 117 197 +80 # of unexpected successes 25 25 0 # of expected failures 1552 1552 0 # of unresolved testcases 0 33 +33 # of unsupported tests 3888 3888 0 For reference, here are a few of the newly failing tests: FAIL: gcc.dg/asan/pr82517.c -O0 (internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1188) FAIL: c-c++-common/cpp/embed-28.c -Wc++-compat (internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1188) FAIL: gcc.dg/torture/stackalign/pr16660-3.c -Os -mforce-drap -fpic (internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1188) The stack pointer register is explicitly rejected by simplify_subreg_regno. So gen_lowpart fails to return a regno for 'sp' in config/i386/i386.md:28561: ;; Attempt to always use XOR for zeroing registers (including FP modes). (define_peephole2 [(set (match_operand 0 "general_reg_operand") (match_operand 1 "const0_operand"))] "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ()) && peep2_regno_dead_p (0, FLAGS_REG)" [(parallel [(set (match_dup 0) (const_int 0)) (clobber (reg:CC FLAGS_REG))])] "operands[0] = gen_lowpart (word_mode, operands[0]);") peephole2 tries to generate the following, but gen_lowpart cannot create the subreg: (insn 29 8 12 2 (parallel [ (set (subreg:DI (reg/v:SI 7 sp [ a ]) 0) (const_int 0 [0])) (clobber (reg:CC 17 flags)) ]) ICE backtrace: .../gcc/gcc/testsuite/gcc.dg/asan/pr82517.c:34:1: internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1188 0x22d3d1f internal_error(char const*, ...) /home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic-global-context.cc:517 0x669157 fancy_abort(char const*, int, char const*) /home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic.cc:1815 0x46df3d gen_reg_rtx(machine_mode) /home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:1188 0x946c80 copy_to_reg(rtx_def*) /home/dinux/projects/pru/local-workspace/gcc/gcc/explow.cc:622 0xd5c64f gen_lowpart_general(machine_mode, rtx_def*) /home/dinux/projects/pru/local-workspace/gcc/gcc/rtlhooks.cc:56 0x1998ca9 gen_split_318(rtx_insn*, rtx_def**) /home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:12877 0x1d8a2b9 split_insns(rtx_def*, rtx_insn*) /home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:20885 0x930f8b try_split(rtx_def*, rtx_insn*, int) /home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:3952 0xd18ab5 split_insn /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3479 0xd1e877 split_all_insns() /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3583 0xd1e928 execute /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:4507 Thanks, Dimitar > > Thanks, > Richard