Dimitar Dimitrov <dimi...@dinux.eu> writes: > On Thu, May 29, 2025 at 05:59:44PM +0100, Richard Sandiford wrote: >> Sorry for the slow reply. >> >> Dimitar Dimitrov <dimi...@dinux.eu> writes: >> > On Fri, May 16, 2025 at 06:14:30PM +0100, Richard Sandiford wrote: >> >> Dimitar Dimitrov <dimi...@dinux.eu> writes: > ... >> >> It might still be worth trying to use simplify_subreg_regno and >> >> seeing what breaks. Any fallaout would at least let us expand >> >> the comments to explain the constraints. >> > >> > I tried simplify_subreg_regno, and some tests regressed for >> > x86_64-pc-linux-gnu: >> > >> > check-gcc-c trunk patched change >> > -------------------------------+-------------------+------------+----- >> > # of expected passes 216431 216358 -73 >> > # of unexpected failures 117 197 +80 >> > # of unexpected successes 25 25 0 >> > # of expected failures 1552 1552 0 >> > # of unresolved testcases 0 33 +33 >> > # of unsupported tests 3888 3888 0 >> > >> > For reference, here are a few of the newly failing tests: >> > FAIL: gcc.dg/asan/pr82517.c -O0 (internal compiler error: in >> > gen_reg_rtx, at emit-rtl.cc:1188) >> > FAIL: c-c++-common/cpp/embed-28.c -Wc++-compat (internal compiler >> > error: in gen_reg_rtx, at emit-rtl.cc:1188) >> > FAIL: gcc.dg/torture/stackalign/pr16660-3.c -Os -mforce-drap -fpic >> > (internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1188) >> > >> > The stack pointer register is explicitly rejected by simplify_subreg_regno. >> > So gen_lowpart fails to return a regno for 'sp' in >> > config/i386/i386.md:28561: >> > >> > ;; Attempt to always use XOR for zeroing registers (including FP modes). >> > (define_peephole2 >> > [(set (match_operand 0 "general_reg_operand") >> > (match_operand 1 "const0_operand"))] >> > "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD >> > && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ()) >> > && peep2_regno_dead_p (0, FLAGS_REG)" >> > [(parallel [(set (match_dup 0) (const_int 0)) >> > (clobber (reg:CC FLAGS_REG))])] >> > "operands[0] = gen_lowpart (word_mode, operands[0]);") >> > >> > peephole2 tries to generate the following, but gen_lowpart cannot create >> > the >> > subreg: >> > >> > (insn 29 8 12 2 (parallel [ >> > (set (subreg:DI (reg/v:SI 7 sp [ a ]) 0) >> > (const_int 0 [0])) >> > (clobber (reg:CC 17 flags)) >> > ]) >> > >> > ICE backtrace: >> > .../gcc/gcc/testsuite/gcc.dg/asan/pr82517.c:34:1: internal compiler >> > error: in gen_reg_rtx, at emit-rtl.cc:1188 >> > 0x22d3d1f internal_error(char const*, ...) >> > >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic-global-context.cc:517 >> > 0x669157 fancy_abort(char const*, int, char const*) >> > >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic.cc:1815 >> > 0x46df3d gen_reg_rtx(machine_mode) >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:1188 >> > 0x946c80 copy_to_reg(rtx_def*) >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/explow.cc:622 >> > 0xd5c64f gen_lowpart_general(machine_mode, rtx_def*) >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/rtlhooks.cc:56 >> > 0x1998ca9 gen_split_318(rtx_insn*, rtx_def**) >> > >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:12877 >> > 0x1d8a2b9 split_insns(rtx_def*, rtx_insn*) >> > >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:20885 >> > 0x930f8b try_split(rtx_def*, rtx_insn*, int) >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:3952 >> > 0xd18ab5 split_insn >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3479 >> > 0xd1e877 split_all_insns() >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3583 >> > 0xd1e928 execute >> > /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:4507 >> >> As an experiment, it would be interesting to see whether subregs of >> the stack, frame, and argument pointers are the only exceptions, >> in which case we could add a special case. > > Yes, with added exceptions for those registers, x86_64-pc-linux-gnu > has no regressions (see attached patch).
Ah, nice. > Sorry, I cannot provide test results for other primary targets. I'm > struggling to setup a cross-linux toolchain build and qemu dejagnu testing. > >> >> But if the PR has already been fixed, then I supose there's not much >> incentive to rework such a sensitive piece of code. :) > > I admit, CI going back to green for pru-unknown-elf certainly removed > the pressure :) It's up to you. I think it would be worth pushing the patch (with the comments below) as an experiment, with the understanding that we'll back it out if it turns out to cause too much breakage. Like I say, we'd at least then gather information about what the constraints are. > Andrew Pinski commented in PR119966 that the rootcause was probably not > fixed. Issue instead became latent again. But right now I don't seem > to have the expertise needed to rework validate_subreg. > > Thanks, > Dimitar > > >> >> Thanks, >> Richard > > From ea23fe8e509131c86149593464cfc7a8e69db7cb Mon Sep 17 00:00:00 2001 > From: Dimitar Dimitrov <dimi...@dinux.eu> > Date: Mon, 2 Jun 2025 08:10:16 +0300 > Subject: [RFC PATCH] emit-rtl: Use simplify_subreg_regno to validate hardware > subregs [PR119966] > To: gcc-patches@gcc.gnu.org > > The simplify_subreg_regno performs more validity checks than > the simple info.representable_p, most importantly with the > calls to targetm.hard_regno_mode_ok. > > The checks for stack-related registers is bypassed because the i386 > backend generates them, in a seemingly valid peephole optimization. > > This is just an experiment to investigate validate_subreg. > > No regressions were detected for C and C++ on x86_64-pc-linux-gnu. > > Signed-off-by: Dimitar Dimitrov <dimi...@dinux.eu> > --- > gcc/emit-rtl.cc | 2 +- > gcc/rtl.h | 3 ++- > gcc/rtlanal.cc | 30 +++++++++++++++++------------- > 3 files changed, 20 insertions(+), 15 deletions(-) > > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc > index 3f453cda67e..acffa5aacc1 100644 > --- a/gcc/emit-rtl.cc > +++ b/gcc/emit-rtl.cc > @@ -987,7 +987,7 @@ validate_subreg (machine_mode omode, machine_mode imode, > else if (!REG_CAN_CHANGE_MODE_P (regno, imode, omode)) > return false; > > - return subreg_offset_representable_p (regno, imode, offset, omode); > + return simplify_subreg_regno (regno, imode, offset, omode, true) >= 0; I think we should add a comment why: /* ??? Pass true to allow_stack_regs because targets like x86 expect to be able to take subregs of the stack pointer. */ > } > /* Do not allow normal SUBREG with stricter alignment than the inner MEM. > > diff --git a/gcc/rtl.h b/gcc/rtl.h > index 7049ed775e8..803fd9db49d 100644 > --- a/gcc/rtl.h > +++ b/gcc/rtl.h > @@ -2498,7 +2498,8 @@ extern bool subreg_offset_representable_p (unsigned > int, machine_mode, > poly_uint64, machine_mode); > extern unsigned int subreg_regno (const_rtx); > extern int simplify_subreg_regno (unsigned int, machine_mode, > - poly_uint64, machine_mode); > + poly_uint64, machine_mode, > + bool allow_stack_regs = false); > extern int lowpart_subreg_regno (unsigned int, machine_mode, > machine_mode); > extern unsigned int subreg_nregs (const_rtx); > diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc > index 86a5e473308..e7ff415cc41 100644 > --- a/gcc/rtlanal.cc > +++ b/gcc/rtlanal.cc > @@ -4264,7 +4264,8 @@ subreg_offset_representable_p (unsigned int xregno, > machine_mode xmode, The new parameter should be documented. Maybe change: XREGNO is a hard register number. */ to: XREGNO is a hard register number. ALLOW_STACK_REGS is true if we should allow subregs of stack_pointer_rtx, frame_pointer_rtx. and arg_pointer_rtx (which are normally expected to be the unique way of referring to their respective registers). */ OK with those changes if you want to try it. Thanks, Richard > > int > simplify_subreg_regno (unsigned int xregno, machine_mode xmode, > - poly_uint64 offset, machine_mode ymode) > + poly_uint64 offset, machine_mode ymode, > + bool allow_stack_regs) > { > struct subreg_info info; > unsigned int yregno; > @@ -4275,20 +4276,23 @@ simplify_subreg_regno (unsigned int xregno, > machine_mode xmode, > && !REG_CAN_CHANGE_MODE_P (xregno, xmode, ymode)) > return -1; > > - /* We shouldn't simplify stack-related registers. */ > - if ((!reload_completed || frame_pointer_needed) > - && xregno == FRAME_POINTER_REGNUM) > - return -1; > + if (!allow_stack_regs) > + { > + /* We shouldn't simplify stack-related registers. */ > + if ((!reload_completed || frame_pointer_needed) > + && xregno == FRAME_POINTER_REGNUM) > + return -1; > > - if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM > - && xregno == ARG_POINTER_REGNUM) > - return -1; > + if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM > + && xregno == ARG_POINTER_REGNUM) > + return -1; > > - if (xregno == STACK_POINTER_REGNUM > - /* We should convert hard stack register in LRA if it is > - possible. */ > - && ! lra_in_progress) > - return -1; > + if (xregno == STACK_POINTER_REGNUM > + /* We should convert hard stack register in LRA if it is > + possible. */ > + && ! lra_in_progress) > + return -1; > + } > > /* Try to get the register offset. */ > subreg_get_info (xregno, xmode, offset, ymode, &info);