------- Comment #6 from rguenther at suse dot de 2010-02-01 09:23 -------
Subject: Re: [4.3/4.4/4.5 Regression] Argument unnecessarily spilled
On Mon, 1 Feb 2010, matz at gcc dot gnu dot org wrote:

> ------- Comment #4 from matz at gcc dot gnu dot org 2010-02-01 02:41 -------
> Well, actually it depends. The code generated by 3.4 might theoretically
> be slower, because it potentially uses a misaligned stack slot (incoming
> stack with -m32 is only 4-byte aligned). With -mpreferred-stack-boundary=2,
> newer compilers also use the incoming stack slot instead of copying it around.
>
> Once we have determined that we need a stack slot, the flow is like so: when
> the slot alignment (32 bits) is not known to be enough for its declared type
> (64 bits here), _and_ the preferred stack alignment (128 by default in new
> compilers) is larger than the known slot alignment, then allocate a new
> stack slot.
>
> "allocate a new stack slot" is what differs between old and new compilers.
> New compilers will simply, well, allocate a new slot :) Old compilers will
> only use ADDRESSOF (if the type itself can otherwise be placed into
> registers), a kind of deferred stack slot allocation that waits to see whether
> the address really needs to be taken (in new compilers this is always the
> case, because we can rely on TREE_ADDRESSABLE). If it turns out to be
> necessary, then it will reuse the stack slot, possibly misaligned (and the
> latter could be regarded as a bug).
>
> If the 3.4 compiler had also checked for alignment in the new way (it only
> did so for STRICT_ALIGNMENT targets), it too wouldn't have used the incoming
> stack slot.
>
> This additional checking (not only for STRICT_ALIGNMENT targets) came in
> as a fix for PR18916 (that was after ADDRESSOF was removed already, otherwise
> that fix would have affected that code too).
>
> So, I think, everything works as intended, as long as the alignment facts are
> as they are:
> * long long is 64-bit aligned
> * incoming stack is 32-bit aligned
> * preferred alignment is 128 bits (and that this matters seems fishy too)
>
> One might argue that this should only matter for STRICT_ALIGNMENT targets,
> and therefore that ppc (ref PR18916) is such a target. But that was altivec
> code. And with such code (SSE) x86 also is sort of a STRICT_ALIGNMENT target.

Hm, if we properly communicated that misalignment down to the accesses, it
would still be preferable, for large structs or with -Os, not to do the extra
spilling.

Richard.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42919
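
For reference, the slot decision described in the quoted comment #4 can be written down as a small predicate. This is only a sketch of the condition as stated there, not GCC's actual code: the function name needs_new_stack_slot and its parameters are hypothetical, and all alignments are assumed to be given in bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not a GCC function): models the rule from the
   quoted comment.  A new stack slot is allocated when the known alignment
   of the incoming slot is not enough for the declared type AND the
   preferred stack alignment is larger than the known slot alignment.
   All alignments are in bits. */
static bool needs_new_stack_slot(unsigned known_slot_align,
                                 unsigned type_align,
                                 unsigned preferred_stack_align)
{
    assert(known_slot_align > 0 && type_align > 0
           && preferred_stack_align > 0);

    /* The incoming slot may be misaligned for the type. */
    bool slot_maybe_underaligned = known_slot_align < type_align;

    /* A freshly allocated slot could be aligned better than the incoming one. */
    bool new_slot_can_do_better = preferred_stack_align > known_slot_align;

    return slot_maybe_underaligned && new_slot_can_do_better;
}
```

With the numbers from the comment (slot 32, long long 64, preferred 128) the predicate is true, so the argument gets copied to a new slot. With -mpreferred-stack-boundary=2 the preferred alignment is 2^2 bytes = 32 bits, the second condition fails, and the incoming slot is used directly.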