RE: [RFC] Further LRA subreg handling issues

Matthew Fortune Thu, 26 Jan 2017 09:19:19 -0800

Eric Botcazou <ebotca...@adacore.com> writes:
> > However in lra-constraints.c:simplify_operand_subreg it quite happily
> > performs a reload using the outer mode in this case and only drops
> > down to the inner mode if the outer mode reload would be slower than
> > the inner.
> >
> > Presumably this is safe for non WORD_REGISTER_OPERATIONS targets as
> > the junk upper bits in registers will be ignored; On
> > WORD_REGISTER_OPERATIONS targets then the narrower-than-word mode load
> > will take care of any 'magic' needed to set the upper bits to a safe
> > value in register.
> 
> Yes, I was leaning to the same conclusion before reading your second
> message.
> 
> > So my thinking is that at least WORD_REGISTER_OPERATIONS targets
> > should always reload the inner mode for the case mentioned above much
> > like the same is required for normal subregs. Does that seem
> > reasonable? Have I misunderstood the paradoxical subreg case entirely?
> 
> No, this is correct, see find_reloads:
> 
>             /* We must force a reload of paradoxical SUBREGs
>                of a MEM because the alignment of the inner value
>                may not be enough to do the outer reference.  On
>                big-endian machines, it may also reference outside
>                the object.
> 
>                On machines that extend byte operations and we have a
>                SUBREG where both the inner and outer modes are no wider
>                than a word and the inner mode is narrower, is integral,
>                and gets extended when loaded from memory, combine.c has
>                made assumptions about the behavior of the machine in such
>                register access.  If the data is, in fact, in memory we
>                must always load using the size assumed to be in the
>                register and let the insn do the different-sized
>                accesses.


This part suggests to me that LRA should never be reloading the
paradoxical subreg itself, meaning the whole SLOW_UNALIGNED_ACCESS checking
code in simplify_operand_subreg could be removed unconditionally.  But I get
the feeling the big valid_address_p check (below) will still prevent some
paradoxical subregs from being reloaded via their inner mode.  I haven't
quite understood exactly what the check is trying to achieve yet, though:

      if (!addr_was_valid
          || valid_address_p (GET_MODE (subst), XEXP (subst, 0),
                              MEM_ADDR_SPACE (subst))
          || ((get_constraint_type (lookup_constraint
                                    (curr_static_id->operand[nop].constraint))
               != CT_SPECIAL_MEMORY)
              /* We still can reload address and if the address is
                 valid, we can remove subreg without reloading its
                 inner memory.  */
              && valid_address_p (GET_MODE (subst),
                                  regno_reg_rtx
                                  [ira_class_hard_regs
                                   [base_reg_class (GET_MODE (subst),
                                                    MEM_ADDR_SPACE (subst),
                                                    ADDRESS, SCRATCH)][0]],
                                  MEM_ADDR_SPACE (subst))))
        {
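
To make the paradoxical subreg concern concrete, here is a toy C model.  It
is not GCC code: the function names and byte values are made up for
illustration, and it assumes a target where narrow loads zero-extend.  A
byte-sized value sits in memory next to unrelated data.  Reloading in the
outer (word) mode pulls the neighbouring bytes into the register, whereas
reloading in the inner (byte) mode leaves the upper bits in a known state.

```c
#include <stdint.h>
#include <string.h>

/* Toy model only: one byte-sized value (0x7f) followed by unrelated bytes.  */
static const uint8_t slot[4] = { 0x7f, 0xde, 0xad, 0xbe };

/* Models an outer-mode reload: a full-word load that also reads the
   neighbouring bytes, so the register holds more than the byte value.  */
static uint32_t
load_outer_mode (const uint8_t *p)
{
  uint32_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

/* Models an inner-mode reload: a byte load, zero-extended into the
   register (assumption: the target zero-extends narrow loads).  */
static uint32_t
load_inner_mode (const uint8_t *p)
{
  return (uint32_t) *p;
}
```

On either endianness the outer-mode result differs from 0x7f, while the
inner-mode result has all upper bits clear.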

>                This is doubly true if WORD_REGISTER_OPERATIONS.  In
>                this case eliminate_regs has left non-paradoxical
>                subregs for push_reload to see.  Make sure it does
>                by forcing the reload.

This statement covers the fix I already proposed, but perhaps
simplify_operand_subreg can also hit this issue if a 'normal' subreg appears
in an instruction where both registers and memory are supported (like move
instructions).  In this case the constraints are satisfied, so the fix I
proposed would never get run, but simplify_operand_subreg would.
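
The policy I have in mind can be sketched roughly as below.  This is a
hypothetical model and not the LRA implementation: the types, the function
name and the simplified condition are all assumptions, loosely following the
find_reloads comment (both modes no wider than a word, differing in size).
Everything else is modelled as "keep the outer mode", which stands in for
whatever the existing logic would decide.

```c
#include <stdbool.h>

/* Hypothetical toy types; none of these names exist in GCC.  */
enum reload_choice { RELOAD_OUTER_MODE, RELOAD_INNER_MODE };

struct mem_subreg
{
  unsigned inner_bits;  /* width of the value actually in memory */
  unsigned outer_bits;  /* width the instruction refers to */
};

/* Pick the mode to reload a subreg of a MEM in.  On a target modelled as
   WORD_REGISTER_OPERATIONS, any size-changing subreg that fits in a word
   (paradoxical or normal) is reloaded in its inner mode.  */
static enum reload_choice
choose_reload_mode (struct mem_subreg s, bool word_register_operations,
                    unsigned word_bits)
{
  bool within_word = s.inner_bits <= word_bits && s.outer_bits <= word_bits;

  if (word_register_operations && within_word
      && s.inner_bits != s.outer_bits)
    return RELOAD_INNER_MODE;

  /* Placeholder for the existing decision logic.  */
  return RELOAD_OUTER_MODE;
}
```

With a 32-bit word, both an 8-bit value accessed as 32 bits (paradoxical) and
a 32-bit value accessed as 8 bits (normal) would go through the inner mode
under this model.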

Eric: I see you recently had to modify the code I'm talking about in the post
below. Out of interest... was this another issue brought to light by the
improvements to zero extension elimination?

https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01202.html

Matthew
