Sorry for the slow reply. Dimitar Dimitrov <dimi...@dinux.eu> writes: > On Fri, May 16, 2025 at 06:14:30PM +0100, Richard Sandiford wrote: >> Dimitar Dimitrov <dimi...@dinux.eu> writes: >> > After r16-160-ge6f89d78c1a752, late_combine2 started transforming the >> > following RTL for pru-unknown-elf: >> > >> > (insn 3949 3948 3951 255 (set (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856]) >> > (and:QI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855]) >> > (const_int 3 [0x3]))) >> > (nil)) >> > ... >> > (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0) >> > (zero_extend:SI (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856]))) >> > (nil)) >> > >> > into: >> > >> > (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0) >> > (and:SI (subreg:SI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855]) 0) >> > (const_int 3 [0x3]))) >> > (nil)) >> > >> > That caused libbacktrace build to break for pru-unknown-elf. Register >> > r0.b1 (regno 1) is not valid for SImode, which validate_subreg failed to >> > reject. >> > >> > Fix by calling HARD_REGNO_MODE_OK to ensure that both inner and outer >> > modes are valid for the hardware subreg. >> > >> > This patch fixes the broken PRU toolchain build. It leaves only two >> > test case regressions for PRU, caused by rnreg pass renaming a valid >> > paradoxical subreg into an invalid one. >> > gcc.c-torture/execute/20040709-1.c >> > gcc.c-torture/execute/20040709-2.c >> > >> > PR target/119966 >> > >> > gcc/ChangeLog: >> > >> > * emit-rtl.cc (validate_subreg): Validate inner >> > and outer mode for paradoxical hardware subregs. >> > >> > Co-authored-by: Andrew Pinski <pins...@gmail.com> >> > Signed-off-by: Dimitar Dimitrov <dimi...@dinux.eu> >> > --- >> > gcc/emit-rtl.cc | 3 +++ >> > 1 file changed, 3 insertions(+) >> > >> > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc >> > index e46b0f9eac4..6c5d9b55508 100644 >> > --- a/gcc/emit-rtl.cc >> > +++ b/gcc/emit-rtl.cc >> > @@ -983,6 +983,9 @@ validate_subreg (machine_mode omode, machine_mode >> > imode, >> > if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode)) >> > && GET_MODE_INNER (imode) == omode) >> > ; >> > + else if (!targetm.hard_regno_mode_ok (regno, imode) >> > + || !targetm.hard_regno_mode_ok (regno, omode)) >> > + return false; >> >> It isn't meaningful to test regno against omode, since that isn't >> necessarily the register that would be produced by the subreg. > > Do you refer to the register renaming pass? I can't think of another reason > for the regno of a hardware register in a subreg to be changed.
It's a general property of subregs. For example, suppose we have (reg:SI 0) on a 16-bit target. Then: (subreg:HI (reg:SI 0) 2) folds to (reg:HI 1) rather than (reg:HI 0). It's therefore register 1 rather than register 0 that should be tested against HImode. >> ISTR that this is a sensitive part of the codebase. I think there >> are/were targets that create unfoldable subregs for argument passing >> and return. And I think e500 had unfoldable subregs of FP registers, >> although that port is gone now. > > Could you share what is "unfoldable subreg"? I could not find this phrase > anywhere in the source, except in one comment in the i386 port. Yeah, it was an ad-hoc term, sorry, but... > Perhaps a subreg of a hardware register is "unfoldable" when the hardware > register is not valid in the outer mode? In which case the subreg cannot be > replaced directly with a hardware register? ...yes, that's what I meant. >> So I suppose the question is: when given a hard register, should >> validate_subreg test whether the subreg can be folded to a hard >> register? Or is it more relaxed than that? Do we need different >> rules before LRA (which could fix up subregs through reloading) >> and after LRA (where unfoldable subregs stay unfoldable). > > My naive answer _was_ that validate_subreg should always perform checks for > hardware registers. Now I see it was too naive, because I was not aware of > the different ways targets use subregs. Hence this patch should be dropped. > > Meanwhile PR119966, which this patch hoped to address, got fixed instead > with r16-809-gf725d6765373f7. Ah, nice! >> If validate_subreg should test whether a subreg of a hard register >> can be folded to a hard register, the fix would be to use >> simplify_subreg_regno instead of the current tests. But it looks >> like that was deliberately not done. > > When validate_subreg was introduced with r0-63800-gbeb72684810c1a, > simplify_subreg_regno simply did not exit. The simplify_subreg_regno > itself was added later with r0-89444-geef302d277ea42. Ah, yeah, my bad. I'd forgotten the history of these routines. >> It might still be worth trying to use simplify_subreg_regno and >> seeing what breaks. Any fallaout would at least let us expand >> the comments to explain the constraints. > > I tried simplify_subreg_regno, and some tests regressed for > x86_64-pc-linux-gnu: > > check-gcc-c trunk patched change > -------------------------------+-------------------+------------+----- > # of expected passes 216431 216358 -73 > # of unexpected failures 117 197 +80 > # of unexpected successes 25 25 0 > # of expected failures 1552 1552 0 > # of unresolved testcases 0 33 +33 > # of unsupported tests 3888 3888 0 > > For reference, here are a few of the newly failing tests: > FAIL: gcc.dg/asan/pr82517.c -O0 (internal compiler error: in gen_reg_rtx, > at emit-rtl.cc:1188) > FAIL: c-c++-common/cpp/embed-28.c -Wc++-compat (internal compiler error: > in gen_reg_rtx, at emit-rtl.cc:1188) > FAIL: gcc.dg/torture/stackalign/pr16660-3.c -Os -mforce-drap -fpic > (internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1188) > > The stack pointer register is explicitly rejected by simplify_subreg_regno. > So gen_lowpart fails to return a regno for 'sp' in config/i386/i386.md:28561: > > ;; Attempt to always use XOR for zeroing registers (including FP modes). > (define_peephole2 > [(set (match_operand 0 "general_reg_operand") > (match_operand 1 "const0_operand"))] > "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD > && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ()) > && peep2_regno_dead_p (0, FLAGS_REG)" > [(parallel [(set (match_dup 0) (const_int 0)) > (clobber (reg:CC FLAGS_REG))])] > "operands[0] = gen_lowpart (word_mode, operands[0]);") > > peephole2 tries to generate the following, but gen_lowpart cannot create the > subreg: > > (insn 29 8 12 2 (parallel [ > (set (subreg:DI (reg/v:SI 7 sp [ a ]) 0) > (const_int 0 [0])) > (clobber (reg:CC 17 flags)) > ]) > > ICE backtrace: > .../gcc/gcc/testsuite/gcc.dg/asan/pr82517.c:34:1: internal compiler error: > in gen_reg_rtx, at emit-rtl.cc:1188 > 0x22d3d1f internal_error(char const*, ...) > > /home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic-global-context.cc:517 > 0x669157 fancy_abort(char const*, int, char const*) > /home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic.cc:1815 > 0x46df3d gen_reg_rtx(machine_mode) > /home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:1188 > 0x946c80 copy_to_reg(rtx_def*) > /home/dinux/projects/pru/local-workspace/gcc/gcc/explow.cc:622 > 0xd5c64f gen_lowpart_general(machine_mode, rtx_def*) > /home/dinux/projects/pru/local-workspace/gcc/gcc/rtlhooks.cc:56 > 0x1998ca9 gen_split_318(rtx_insn*, rtx_def**) > > /home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:12877 > 0x1d8a2b9 split_insns(rtx_def*, rtx_insn*) > > /home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:20885 > 0x930f8b try_split(rtx_def*, rtx_insn*, int) > /home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:3952 > 0xd18ab5 split_insn > /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3479 > 0xd1e877 split_all_insns() > /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3583 > 0xd1e928 execute > /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:4507 As an experiment, it would be interesting to see whether subregs of the stack, frame, and argument pointers are the only exceptions, in which case we could add a special case. But if the PR has already been fixed, then I supose there's not much incentive to rework such a sensitive piece of code. :) Thanks, Richard