Eric Botcazou <ebotca...@adacore.com> writes:
> > However in lra-constraints.c:simplify_operand_subreg it quite happily
> > performs a reload using the outer mode in this case and only drops
> > down to the inner mode if the outer mode reload would be slower than
> > the inner.
> >
> > Presumably this is safe for non WORD_REGISTER_OPERATIONS targets as
> > the junk upper bits in registers will be ignored; On
> > WORD_REGISTER_OPERATIONS targets then the narrower-than-word mode load
> > will take care of any 'magic' needed to set the upper bits to a safe
> > value in register.
>
> Yes, I was leaning to the same conclusion before reading your second
> message.
>
> > So my thinking is that at least WORD_REGISTER_OPERATIONS targets
> > should always reload the inner mode for the case mentioned above much
> > like the same is required for normal subregs. Does that seem
> > reasonable? Have I misunderstood the paradoxical subreg case entirely?
>
> No, this is correct, see find_reloads:
>
>     /* We must force a reload of paradoxical SUBREGs
>        of a MEM because the alignment of the inner value
>        may not be enough to do the outer reference.  On
>        big-endian machines, it may also reference outside
>        the object.
>
>        On machines that extend byte operations and we have a
>        SUBREG where both the inner and outer modes are no wider
>        than a word and the inner mode is narrower, is integral,
>        and gets extended when loaded from memory, combine.c has
>        made assumptions about the behavior of the machine in such
>        register access.  If the data is, in fact, in memory we
>        must always load using the size assumed to be in the
>        register and let the insn do the different-sized
>        accesses.
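
To make the hazard in that comment concrete, here is a toy model in plain C. It is not GCC code: the helper names are hypothetical, a "register" is just a 32-bit word, byte loads are assumed to sign-extend, and memory is assumed to be little-endian. It only sketches why loading in the outer mode can break an assumption that a load in the inner mode would have satisfied.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: these helpers are hypothetical, not GCC functions.
   A "register" is a 32-bit word; memory is an array of bytes.  */

/* Inner-mode (byte) load on a target assumed to sign-extend byte loads:
   the upper 24 bits are defined by the load itself.  */
static uint32_t
load_inner_byte_sign_extended (const uint8_t *mem)
{
  uint32_t v = mem[0];
  return (v ^ 0x80u) - 0x80u;   /* sign-extend from 8 bits */
}

/* Outer-mode (word) load from the same address, little-endian layout
   assumed: the upper 24 bits come from the neighbouring bytes.  */
static uint32_t
load_outer_word (const uint8_t *mem)
{
  return (uint32_t) mem[0]
         | ((uint32_t) mem[1] << 8)
         | ((uint32_t) mem[2] << 16)
         | ((uint32_t) mem[3] << 24);
}

/* The property an earlier pass might rely on: the register holds the
   sign extension of its own low byte.  */
static int
upper_bits_are_sign_copies (uint32_t reg)
{
  uint32_t low = reg & 0xffu;
  return reg == ((low ^ 0x80u) - 0x80u);
}

/* Small self-check of the model.  */
int
model_self_check (void)
{
  const uint8_t mem[4] = { 0x80, 0x12, 0x34, 0x56 };
  assert (upper_bits_are_sign_copies (load_inner_byte_sign_extended (mem)));
  assert (!upper_bits_are_sign_copies (load_outer_word (mem)));
  return 1;
}
```

Both loads agree on the low byte, so code that only looks at the low 8 bits cannot tell them apart. Only the inner-mode load leaves the upper bits in the state that the extension assumption requires.
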

This part suggests to me that LRA should never be reloading the
paradoxical subreg, meaning the whole SLOW_UNALIGNED_ACCESS checking code
in simplify_operand_subreg could be removed unconditionally. But I get
the feeling the big valid_address_p check (below) will still prevent some
paradoxical subregs from being reloaded via their inner mode. I haven't
quite understood exactly what the check is trying to achieve yet though:

    if (!addr_was_valid
        || valid_address_p (GET_MODE (subst), XEXP (subst, 0),
                            MEM_ADDR_SPACE (subst))
        || ((get_constraint_type (lookup_constraint
                                  (curr_static_id->operand[nop].constraint))
             != CT_SPECIAL_MEMORY)
            /* We still can reload address and if the address is
               valid, we can remove subreg without reloading its
               inner memory.  */
            && valid_address_p (GET_MODE (subst),
                                regno_reg_rtx
                                [ira_class_hard_regs
                                 [base_reg_class (GET_MODE (subst),
                                                  MEM_ADDR_SPACE (subst),
                                                  ADDRESS, SCRATCH)][0]],
                                MEM_ADDR_SPACE (subst))))
      {

>        This is doubly true if WORD_REGISTER_OPERATIONS.  In
>        this case eliminate_regs has left non-paradoxical
>        subregs for push_reload to see.  Make sure it does
>        by forcing the reload.

This statement covers the fix I already proposed but perhaps
simplify_operand_subreg can also hit this issue if a 'normal' subreg
appears in an instruction where registers and memory are supported (like
move instructions). In this case the constraints are satisfied and the
fix I proposed would never get run but simplify_operand_subreg would.

Eric: I see you recently had to modify the code I'm talking about in the
post below. Out of interest... was this another issue brought to light by
the improvements to zero extension elimination?

https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01202.html

Matthew