The stack should be aligned at an 8 mod 16 boundary at function entry, under both Apple's ABI and the "System V Application Binary Interface AMD64 Architecture Processor Supplement, draft 0.99".
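
To make the "8 mod 16" rule concrete, here is a minimal sketch of the arithmetic in portable C. It is not a reproducer for the bug. The helper names `entry_rsp_ok` and `aligned16` are hypothetical, and the stack pointer values are passed in as plain integers instead of being read from the real register.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero when rsp has the value the ABI requires
   at function entry, i.e. rsp is congruent to 8 modulo 16. */
int entry_rsp_ok(uintptr_t rsp)
{
    return (rsp & 15u) == 8u;
}

/* Hypothetical helper: nonzero when addr is a multiple of 16, which is
   what an aligned 16-byte access such as MOVDQA expects of its memory
   operand. */
int aligned16(uintptr_t addr)
{
    return (addr & 15u) == 0u;
}
```

If the caller's stack pointer is a multiple of 16 just before the CALL, the 8-byte return address pushed by CALL leaves it at 8 mod 16 on entry, so `entry_rsp_ok(sp - 8)` holds whenever `aligned16(sp)` does. A stack pointer that is only known to be 0 mod 8 can fail this check.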
GCC aligns the stack at a 0 mod 8 boundary. To quote from the above-mentioned document:

  "The end of the input argument area shall be aligned on a 16 byte boundary. In other words, the value (%rsp - 8) is always a multiple of 16 when control is transferred to the function entry point."

How come this hasn't been discovered before (at least I cannot find any bug reports about it)? Because the x86 is very lax about alignment. A few instructions are not that lax, though: MOVDQA on a misaligned address will trigger a SIGSEGV on *nix systems. For other 16-byte loads and stores, misalignment is only a performance issue.

I have no test case for this, although it is possible to trigger in a shared library under Darwin, since the runtime loader uses MOVDQA.

Note that this bug has also been verified to exist for x86_64-*-freebsd and, from reading the compiler sources, it also affects GNU/Linux.

--
Summary: Stack not aligned at mod 16 byte boundary in x86_64 code
Product: gcc
Version: 4.2.2
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: tege-gcc at swox dot com
GCC build triplet: i386-apple-darwin8.11.1
GCC host triplet: i386-apple-darwin8.11.1
GCC target triplet: i386-apple-darwin8.11.1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35271