Re: [RFC] Further LRA subreg handling issues

David Malcolm Thu, 26 Jan 2017 08:22:36 -0800

On Thu, 2017-01-26 at 13:00 +0000, Matthew Fortune wrote:
> Matthew Fortune <matthew.fort...@imgtec.com> writes:
> ...
> > Pseudo 300 is assigned to memory and then LRA produces a simple
> > DImode
> > load from the assigned stack slot. The only instruction to set
> > pseudo
> > 300 is:
> > 
> > (insn 247 212 389 3 (set (reg:SI 300)
> >         (ne:SI (subreg/s/u:SI (reg/v:DI 231 [ taken ]) 0)
> >             (const_int 0 [0])))
> > "/home/mfortune/gcc/gcc/predict.c":2904
> > 504 {*sne_zero_sisi}
> >      (nil))
> > 
> > Which leads to an SImode store to the stack slot:
> > 
> > (insn 247 392 393 3 (set (reg:SI 4 $4 [300])
> >         (ne:SI (reg:SI 20 $20 [orig:231 taken ] [231])
> >             (const_int 0 [0])))
> > "/home/mfortune/gcc/gcc/predict.c":2904
> > 504 {*sne_zero_sisi}
> >      (nil))
> > (insn 393 247 389 3 (set (mem/c:SI (plus:DI (reg/f:DI 29 $sp)
> >                 (const_int 16 [0x10])) [403 %sfp+16 S4 A64])
> >         (reg:SI 4 $4 [300]))
> > "/home/mfortune/gcc/gcc/predict.c":2904 312
> > {*movsi_internal}
> >      (nil))
> > ...
> > 
> > (note 248 246 249 40 NOTE_INSN_DELETED)
> > (note 249 248 256 40 NOTE_INSN_DELETED)
> > (note 256 249 250 40 NOTE_INSN_DELETED)
> > (insn 250 256 251 40 (set (reg:DI 6 $6)
> >         (mem/c:DI (plus:DI (reg/f:DI 29 $sp)
> >                 (const_int 16 [0x10])) [403 %sfp+16 S8 A64]))
> > "/home/mfortune/gcc/gcc/predict.c":2904 310 {*movdi_64bit}
> >      (nil))
> > 
> > My assumption is that LRA is again expected to deal with this case
> > and
> > for insn
> > 250 should be recognising that it must load 32-bits and rely on
> > implicit
> > LOAD_EXTEND_OP behaviour producing an acceptable 64-bit value. In
> > this
> > case it does not matter whether it is sign or zero extension and my
> > assumption is that this construct would never appear if a specific
> > sign
> > or zero extension was required.
> > 
> > I haven't got to looking at where the issue is this time but it
> > seems
> > different as this is a subreg in a simple move instruction where we
> > already support the load/ store directly so no new reload
> > instruction is
> > required. I don't know if this implies that simple move patterns
> > should
> > reject subregs but that doesn't sound right either.
> > 
> > Resolving this fixes at least one bug and potentially all bugs in
> > the
> > MIPS bootstrap as  manually modified the generated assembly code to
> > use
> > LW instead of LD for insn
> > 250 and one of the buggy stage 3 objects is fixed.
> > 
> > I'll keep thinking, any advice in the meantime is appreciated.
> 
> All I have been able to determine on this is that there is
> potentially
> different behaviour for paradoxical subregs in LRA vs reload.  There
> is
> this comment in reload.c:push_reload:
> 
>     If we have (SUBREG:M1 (MEM:M2 ...) ...) (or an inner REG that is
> still
>      a pseudo and hence will become a MEM) with M1 wider than M2 and
> the
>      register is a pseudo, also reload the inside expression.
> 
> To me this makes perfect sense as I believe the RTL is only saying
> that
> there is an M2-mode object to access or at least only the M2-mode
> sized
> bits are valid. There are comments to say there will always be
> sufficient
> memory assigned for spill slots as they are sized to fit the largest
> paradoxical subreg, I just don't know why that is useful/important.
> 
> However in lra-constraints.c:simplify_operand_subreg it quite happily
> performs a reload using the outer mode in this case and only drops
> down to
> the inner mode if the outer mode reload would be slower than the
> inner.
> 
> Presumably this is safe for non WORD_REGISTER_OPERATIONS targets as
> the
> junk upper bits in registers will be ignored; On
> WORD_REGISTER_OPERATIONS
> targets then the narrower-than-word mode load will take care of any
> 'magic' needed to set the upper bits to a safe value in register.
> 
> So my thinking is that at least WORD_REGISTER_OPERATIONS targets
> should
> always reload the inner mode for the case mentioned above much like
> the same
> is required for normal subregs. Does that seem reasonable? Have I
> misunderstood the paradoxical subreg case entirely?
> 
> I've only done superficial testing of a change to this code so far
> but my
> testcase starts working at least which is a start.


FWIW, the RTL "frontend" [1] is now in trunk (as of r244878), so it
should now possible to write small fragments of RTL as testcases in
DejaGnu.

I don't know if it's helpful for this bug though.

In case it is, I started some documentation for it here:
  https://gcc.gnu.org/ml/gcc-patches/2017-01/msg02065.html


Dave

[1] I put "frontend" in quotes as it's actually an extension to the C
frontend, rather than a true frontend.

Re: [RFC] Further LRA subreg handling issues

Reply via email to