[Bug target/27537] XMM alignment fault when compiling for i386 with -Os

agner at agner dot org Wed, 23 Aug 2006 01:04:36 -0700


------- Comment #11 from agner at agner dot org  2006-08-23 08:04 -------
This problem wouldn't have happened if the ABI had been better maintained.
Somebody decides to change the calling convention without properly documenting
the change, and somebody else makes another change that is incompatible because
the alignment requirement has never made it into the ABI documents.


Let me help you making a decision on this issue by summarizing the pro's and
con's of 16-bytes stack alignment in 32-bit x86 Linux/BSD.

Advantages of enforcing 16-bytes stack alignment:
-------------------------------------------------
1.
The use of XMM code is becoming more common now that all new computers have
support for the SSE2 or higher instructions set. The necessary alignment of XMM
variables can be implemented more efficiently when the stack is aligned.

2.
Variables of type double (double precision floating point) are accessed more
efficiently when aligned by 8. This is easily achieved when the stack is
aligned.

3.
Function parameters of type double will automatically get the optimal
alignment, unless the parameter is preceded by an odd number of smaller
parameters (including any 'this' pointer and return pointer). This means that
more than 50% of function parameters of type double will be optimally aligned,
versus 50% without stack alignment. The C/C++ programmer will be able to ensure
optimal alignment by manipulating the order of function parameters.

4.
Functions that need to align local variables can do so without using EBP as
stack frame. This frees EBP for other purposes. General purpose registers is a
scarce resource in 32-bit mode.

5.
16-bytes stack alignment is officially enforced in Intel-based Mac OS X. It is
desirable to have identical ABI's for Linux, BSD and Mac. This makes it
possible to use the same compilers and the same function libraries for all
three platforms (except for the object file format, which can be converted).

6.
The stack alignment requires no extra instructions in leaf functions, which are
more likely to contain the critical innermost loop than non-leaf functions.

7.
The stack alignment requires no extra instructions in a non-leaf function if
the function adjusts the stack pointer anyway for the sake of allocating local
storage.

8.
Stack alignment is already implemented in Gcc and existing code relies on it.


Disadvantages of enforcing 16-bytes stack alignment:
----------------------------------------------------
1.
A non-leaf function without any stack space allocated for local storage needs
one or two extra instructions for conforming to the stack alignment
requirement.

2.
The alignment requirement results in unused space in the stack. This takes up
to 12 bytes of extra space in the data cache for each function calling level
except the innermost. Assuming that only the innermost three function levels
matter in terms of speed, and that the number of unused bytes is 8 on average
for all but the innermost function, the total amount of space wasted in the
data cache is 16 bytes.

3.
The Intel compiler does not enforce stack alignment. However, the Intel people
are ready to change this as soon as you Gnu people make a decision on this
issue. I have contact with the Intel people about this issue.

4.
Stack alignment is not enforced in 32-bit Windows. Compatibility with the
Windows ABI might be desirable.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27537

[Bug target/27537] XMM alignment fault when compiling for i386 with -Os

Reply via email to