http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50883
Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
                 CC|                            |rearnsha at gcc dot gnu.org

--- Comment #6 from Richard Earnshaw <rearnsha at gcc dot gnu.org> 2011-10-27 16:53:07 UTC ---
So for PPC, this seems to be a lucky side-effect of the way the PPC ABI is
defined. On PPC the argument is passed by reference and the compiler generates
an initial load expression of X from memory (a callee copy). The RTL
optimization passes are then (in combination with the subreg splitting code)
able to notice, somehow, that this is equivalent to the final result and to
make use of it.

On ARM the structure is passed by value and the argument expanding code
immediately tries to store a copy into the stack. Then, rather than seeing
that this could be used from a register value it tries to work solely with
the stack copy.

I notice that neither architecture optimizes this if the struct is passed as
an address rather than by value.
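
For readers without the test case to hand, the two calling styles contrasted in the comment might look roughly like the sketch below. This is not the code attached to the bug report: the struct `pair` and both function names are hypothetical, chosen only to show a small struct passed by value versus passed as an address, with a result that is equivalent to the incoming argument.

```c
/* Hypothetical small struct; NOT the test case from the bug report. */
struct pair {
    short lo;
    short hi;
};

/* Struct passed by value. The result is rebuilt field by field from the
 * argument, so it is equivalent to the incoming value. How the argument
 * actually arrives (registers, stack copy, hidden reference) is decided
 * by the target ABI, not by this source. */
struct pair copy_by_value(struct pair x)
{
    struct pair r;
    r.lo = x.lo;
    r.hi = x.hi;
    return r;
}

/* Same operation, but the caller passes the address of the struct. */
struct pair copy_by_address(const struct pair *x)
{
    struct pair r;
    r.lo = x->lo;
    r.hi = x->hi;
    return r;
}
```

Compiling a file like this with something such as `gcc -O2 -S` on each target and comparing the generated assembly for the two functions is one way to look for the difference described above. What the output looks like depends on the compiler version and ABI.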