On 3/21/19 12:15 PM, Richard Biener wrote: > On Sun, 10 Mar 2019, Bernd Edlinger wrote: > Finally... > > Index: gcc/function.c > =================================================================== > --- gcc/function.c (revision 269264) > +++ gcc/function.c (working copy) > @@ -2210,6 +2210,12 @@ use_register_for_decl (const_tree decl) > if (DECL_MODE (decl) == BLKmode) > return false; > > + if (STRICT_ALIGNMENT && TREE_CODE (decl) == PARM_DECL > + && DECL_INCOMING_RTL (decl) && MEM_P (DECL_INCOMING_RTL (decl)) > + && GET_MODE_ALIGNMENT (DECL_MODE (decl)) > + > MEM_ALIGN (DECL_INCOMING_RTL (decl))) > + return false; > + > /* If -ffloat-store specified, don't put explicit float variables > into registers. */ > /* ??? This should be checked after DECL_ARTIFICIAL, but tree-ssa > > I wonder if it is necessary to look at DECL_INCOMING_RTL here > and why such RTL may not exist? That is, iff DECL_INCOMING_RTL > doesn't exist then shouldn't we return false for safety reasons? >
I think that happens a few times already before the INCOMING_RTL is assigned. I thought that might be too pessimistic. > Similarly the very same issue should exist on x86_64 which is > !STRICT_ALIGNMENT, it's just the ABI seems to provide the appropriate > alignment on the caller side. So the STRICT_ALIGNMENT check is > a wrong one. > I may be plain wrong here, but I thought that !STRICT_ALIGNMENT targets just use MEM_ALIGN to select the right instructions. MEM_ALIGN is always 32-bit align on the DImode memory. The x86_64 vector instructions would look at MEM_ALIGN and do the right thing, yes? It seems to be the definition of STRICT_ALIGNMENT targets that all RTL instructions need to have MEM_ALIGN >= GET_MODE_ALIGNMENT, so the target does not even have to look at MEM_ALIGN except in the mov_misalign_optab, right? The other hunk, where I admit I did not fully understand the comment, tries only to increase the MEM_ALIGN to 64-bit if the stack slot is 64-bit aligned although the target said it only needs 32-bit alignment. So that it is no longer necessary to copy the incoming value. > Which makes me think that a proper fix is not here, but in > target(hook) code. > > Changing use_register_for_decl sounds odd anyways since if we return true > we for the testcase still end up in memory, no? > It seems to make us use the incoming register _or_ stack slot if this function returns true here. If it returns false here, a new stack slot is allocated, but only if the original stack slot was not aligned. This works together with the other STRICT_ALIGNMENT check in assign_parm_adjust_stack_rtl. Where also for !STRICT_ALIGNMENT target TYPE_ALIGN and MEM_ALIGN are checked, but this seems to have only an effect if an address is taken, in that case I see use_register_for_decl return false due to TREE_ADDRESSABLE (decl), and whoops, we have an aligned copy of the unaligned stack slot. So I believe that there was already a fix for unaligned stack positions, that relied on the addressability of the parameter, while the target relied on the 8-byte alignment of the DImode access. > The hunk obviously misses a comment since the effect that this > will cause a copy to be emitted isn't obvious (and relying on > this probably fragile). > Yes, also that the copy is done using movmisalign optab is important. Thanks Bernd.