On Fri, 2023-07-14 at 09:29 -0400, Vladimir Makarov wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the
> content is safe
>
> On 7/13/23 05:27, SenthilKumar.Selvaraj--- via Gcc wrote:
> > Hi,
> >
> > I've been spending some (spare) time checking what it would take to
> > make LRA work for the avr target.
> >
> > Right after I removed the TARGET_LRA_P hook disabling LRA, building
> > libgcc failed with a weird ICE.
> > On the avr, the stack pointer (SP)
> > is not used to access stack slots
> It is very uncommon target then.
> > - TARGET_CAN_ELIMINATE returns false
> > if frame_pointer_needed, and TARGET_FRAME_POINTER_REQUIRED returns true
> > if get_frame_size() > 0.
> >
> > With LRA, however, reload generates
> >
> > (insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
> > (const_int 1 [0x1])) [2 %sfp+1 S1 A8])
> > (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86
> > {movqi_insn_split}
> > (nil))
> >
> > and the backend code errors out when it finds SP is being used as a
> > pointer register.
> >
> > Digging through the RTL dumps, I found the following. For the
> > following insn sequence in *.ira
> >
> > (insn 189 128 159 7 (set (reg:HI 58 [ b ])
> > (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
> > (nil))
> > (insn 159 189 160 7 (set (subreg:QI (reg:HI 58 [ b ]) 0)
> > (reg:QI 86 [ a ])) "case.c":7:7 86 {movqi_insn_split}
> > (nil))
> > (insn 160 159 32 7 (set (subreg:QI (reg:HI 58 [ b ]) 1)
> > (reg:QI 87 [ a+1 ])) "case.c":7:7 86 {movqi_insn_split}
> > (nil))
> >
> > 1. For r58, IRA picks R28:R29, which is the frame pointer for avr.
> >
> > Popping a13(r58,l0) -- assign reg 28
> >
> > 2. LRA sees the subreg in insn 159 and generates a reload reg
> > (r125). simplify_subreg_regno (lra-constraints.cc:1810) however
> > bails (returns -1) if the reg involved is FRAME_POINTER_REGNUM and
> > reload isn't completed yet. LRA therefore decides rclass for the
> > pseudo reg is NO_REGS.
> >
> > <snip>
> > Creating newreg=125 from oldreg=58, assigning class NO_REGS to subreg reg
> > r125
> > 159: r125:HI#0=r86:QI
> >
> > 4. As rclass is NO_REGS, LRA picks an insn alternative that involves
> > memory.
> > That is my understanding, please correct me if I'm wrong.
> > <snip>
> > 0 Small class reload: reject+=3
> > 0 Non input pseudo reload: reject++
> > Cycle danger: overall += LRA_MAX_REJECT
> > alt=0,overall=610,losers=1,rld_nregs=1
> > 0 Small class reload: reject+=3
> > 0 Non input pseudo reload: reject++
> > alt=1: Bad operand -- refuse
> > 0 Non pseudo reload: reject++
> > alt=2,overall=1,losers=0,rld_nregs=0
> > Choosing alt 2 in insn 159: (0) Qm (1) rY00 {movqi_insn_split}
> >
> > 5. LRA creates stack slots, and then uses the FP register to access
> > the slots. This is despite r58 already being assigned R28:R29.
> >
> > 6. TARGET_FRAME_POINTER_REQUIRED is never called, and therefore
> > frame_pointer_needed is not set, despite the creation of stack
> > slots. TARGET_CAN_ELIMINATE therefore okays elimination of FP to SP,
> > and this eventually causes the ICE when the avr backend sees SP being
> > used as a pointer register.
> >
> > This is the relevant sequence after reload
> > <snip>
> > (insn 189 128 239 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
> > (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
> > (nil))
> > (insn 239 189 159 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
> > (const_int 1 [0x1])) [2 %sfp+1 S2 A8])
> > (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split}
> > (nil))
> > (insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
> > (const_int 1 [0x1])) [2 %sfp+1 S1 A8])
> > (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86
> > {movqi_insn_split}
> > (nil))
> > (insn 240 159 241 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
> > (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
> > (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101
> > {*movhi_split}
> > (nil))
> > (insn 241 240 160 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
> > (const_int 1 [0x1])) [2 %sfp+1 S2 A8])
> > (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split}
> > (nil))
> > (insn 160 241 242 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
> > (const_int 2 [0x2])) [2 %sfp+2 S1 A8])
> > (reg:QI 18 r18 [orig:87 a+1 ] [87])) "case.c":7:7 86
> > {movqi_insn_split}
> > (nil))
> > (insn 242 160 33 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
> > (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
> > (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101
> > {*movhi_split}
> > (nil))
> >
> > For choices other than FP, simplify_subreg_regno returns the correct part
> > of the wider HImode reg, so rclass is not NO_REGS, and things workout
> > fine.
> >
> > I checked what classic reload does in the same situation - it picks a
> > different register (R25) instead of spilling to a stack slot.
> >
> > <snip>
> > (insn 189 128 159 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
> > (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
> > (nil))
> > (insn 159 189 226 7 (set (reg:QI 25 r25)
> > (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86
> > {movqi_insn_split}
> > (nil))
> > (insn 226 159 160 7 (set (reg:QI 28 r28)
> > (reg:QI 25 r25)) "case.c":7:7 86 {movqi_insn_split}
> > (nil))
> > (insn 160 226 227 7 (set (reg:QI 25 r25)
> > (reg:QI 18 r18 [orig:87 a+1 ] [87])) "case.c":7:7 86
> > {movqi_insn_split}
> > (nil))
> > (insn 227 160 33 7 (set (reg:QI 29 r29)
> > (reg:QI 25 r25)) "case.c":7:7 86 {movqi_insn_split}
> > (nil))
> >
> >
> > My questions:
> >
> > 1. Is there something obvious the avr backend is doing wrong that is
> > causing this?
> No. In my opinion if it worked with reload, it should work with LRA
> because LRA is a substitute for reload.
> > 2. Shouldn't LRA ask the backend for frame_pointer_required_p and
> > update frame_pointer_needed if it creates stack slots?
> I think so.
> > 3. Even if (2) works, I see that lra-eliminates.cc:update_reg_eliminate
> > asserts that if the backend said elimination to SP is
> > ok first up, it
> > cannot reject that elimination later (line 1165). If the only reason
> > FP is required is because LRA created stack
> > slots, what should the backend do?
> >
> I think it can be relaxed for avr on which sp can not be used access
> stack memory
> > 4. When simplify_subreg_regno bails for FP, lra-constraints.cc:1815
> > sets rclass = NO_REGS and forces a spill to memory. The comment says
> > it is to prevent infinite looping, but for this case, doesn't it
> > make sense to look for other regs?
> >
> I guess it can be relaxed for avr and frame-pointer too as avr does not
> use sp for accessing memory
> > 5. I can work around the problem by disabling elimination from FP to SP
> > when lra_in_progress, but I think it pevents IRA/LRA from using
> > R28:R29 even when FP is not required at all?
> Yes, it is better to avoid disabling elimination
> > 6. Basic question, but does FP to SP elimination mean any operation
> > possible with FP should be doable with SP as well?
>
> Hard to say. There are a lot of undocumented RA assumptions.
>
> If you send me the preprocessed test, I could start to work on it to fix
> the problems. I think it is hard to fix them right for a person having
> a little experience with LRA.
>
>
Ok, this is a reduced test case that reproduces the failure.
$ cat case.c
typedef int HItype __attribute__ ((mode (HI)));
HItype
__mulvhi3 (HItype a, HItype b)
{
HItype w;
if (__builtin_mul_overflow (a, b, &w))
__builtin_trap ();
return w;
}
On latest master, this trivial patch turns on LRA for avr
--- gcc/config/avr/avr.cc
+++ gcc/config/avr/avr.cc
@@ -15244,9 +15244,6 @@ avr_float_lib_compare_returns_bool (machine_mode mode,
enum rtx_code)
#undef TARGET_CONVERT_TO_TYPE
#define TARGET_CONVERT_TO_TYPE avr_convert_to_type
-#undef TARGET_LRA_P
-#define TARGET_LRA_P hook_bool_void_false
-
#undef TARGET_ADDR_SPACE_SUBSET_P
#define TARGET_ADDR_SPACE_SUBSET_P avr_addr_space_subset_p
Then configuring and building for avr without attempting to build libgcc
$ configure --target=avr --prefix=<prefix_dir> --enable-languages=c && make
all-host && make install-host
And finally to reproduce the failure
$ <prefix_dir>/bin/avr-gcc -mmcu=avr25 case.c -Os
Thanks for taking this up - please do let me know if you need more information.
Regards
Senthil