On i?86, the Linux kernel (and e.g. valgrind) is compiled with -Os -m32
-mpreferred-stack-boundary=2. AFAIK this is done primarily to keep the
generated code small. But when compiled with gcc 4.4, lots of functions, at
least in valgrind (I haven't checked the kernel, but I assume even more so
there), now use dynamic stack realignment. Short testcase:
void foo (unsigned long long *);
int bar (void)
{
  unsigned long long l;
  foo (&l);
  return 0;
}
The problem is that without -malign-double, long long and double have 32-bit
alignment only as aggregate fields; when used standalone they have 64-bit
alignment for performance reasons. For variables in .data/.rodata etc. this is
not a problem, but when DImode/DFmode vars are automatic, it means
crtl->stack_alignment_estimated is 64 and so the stack is dynamically
realigned, which increases generated code size quite a bit and slows things
down performance-wise as well.
Could we avoid counting DFmode/DImode vars in crtl->stack_alignment_estimated
on i386 using some target macro, or at least skip them conditionally (on
-Os plus -mpreferred-stack-boundary=2, or perhaps based on some other option)?
I'm pretty sure the kernel people will be very unhappy about this.
--
Summary: [4.4 Regression] -mpreferred-stack-boundary=2 causes
lots of dynamic realign
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jakub at gcc dot gnu dot org
GCC target triplet: i?86-*-linux*
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39137