The testcase
int foo(int bar)
{
  int i, res = 0;
  for (i=0; i<bar; ++i) {
    int *x = __builtin_alloca(4);
    res += *x;
  }
  return res;
}
uses a lot more stack than necessary because __builtin_alloca
apparently returns 16-byte aligned stack space; every 4-byte request
costs 32 bytes of stack (-O2 -fomit-frame-pointer):
.L5:
        subl    $32, %esp       #,
        incl    %edx            # i
        leal    15(%esp), %eax  #, tmp64
        andl    $-16, %eax      #, tmp64
        addl    (%eax), %ecx    #, res
        cmpl    %edx, %ebx      # i, bar
        jg      .L5             #,
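The pointer arithmetic in this loop can be mimicked in plain C to see
where the 32 bytes per iteration go. This is only a sketch that models
%esp as an integer; the names align_up and alloca_step_model are made
up for illustration and are not GCC internals.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two).
   With align == 16 this mirrors
   "leal 15(%esp), %eax; andl $-16, %eax". */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~(align - 1);
}

/* Hypothetical model of one iteration of the 4.0 loop: *sp stands in
   for %esp, the return value stands in for %eax. */
static uintptr_t alloca_step_model(uintptr_t *sp)
{
    *sp -= 32;                 /* subl $32, %esp */
    return align_up(*sp, 16);  /* leal 15(%esp), %eax; andl $-16, %eax */
}
```

Starting from an arbitrary stack pointer, each call lowers it by 32 and
returns a 16-byte aligned address at most 15 bytes above the new stack
pointer, so four 4-byte requests consume 128 bytes of stack in this
model.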
I cannot find anything in the C standard about alloca, but alignment
larger than the requested storage size should not be needed (though
this may be architecture dependent).
At the very least, the alignment adjustment should be moved out of the
loop, which would save half of the wasted memory (I know the testcase
is dumb, but at least it's simple).
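To put rough numbers on "half of the wasted memory", here is a small
sketch comparing the two strategies. It assumes the 32-byte step from
the 4.0 output and a hypothetical variant that aligns the stack
pointer once before the loop and then takes 16 bytes per iteration;
both function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stack consumed when every iteration reserves 32 bytes so that it
   can re-align the pointer inside the loop (the 4.0 behaviour). */
static size_t used_realign_each_iteration(size_t iterations)
{
    return 32 * iterations;
}

/* Assumed alternative: round the stack pointer down to a multiple of
   16 once, then take exactly 16 bytes per iteration. */
static size_t used_align_hoisted(uintptr_t sp, size_t iterations)
{
    uintptr_t aligned = sp & ~(uintptr_t)15;  /* one-time alignment */
    size_t padding = (size_t)(sp - aligned);  /* at most 15 bytes */
    assert(padding < 16);
    return padding + 16 * iterations;
}
```

For 1000 iterations the first function gives 32000 bytes, while the
second gives 16000 bytes plus at most 15 bytes of one-time padding.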
Btw, the same problem exists in 3.4. 3.3 doesn't play alignment games
but still allocates 16 bytes each time (which won't be aligned to 16
bytes this way anyway?). 2.95 again beats all of them in simplicity:
.L6:
        addl    $-16,%esp
        addl    (%esp),%eax
        decl    %edx
        jnz     .L6
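On the alignment question for the fixed 16-byte step: subtracting 16
leaves the low four bits of the stack pointer unchanged, so the
returned pointer is 16-byte aligned only if the stack pointer already
was before the loop. A minimal sketch of that, with %esp modelled as
an integer and a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "addl $-16,%esp": lower *sp by 16 and hand
   out the new stack pointer itself as the allocation. */
static uintptr_t fixed_step_model(uintptr_t *sp)
{
    uintptr_t before = *sp;
    *sp -= 16;
    /* Subtracting a multiple of 16 cannot change the low four bits. */
    assert((*sp & 15) == (before & 15));
    return *sp;
}
```

An unaligned starting value therefore stays unaligned on every
iteration, and an aligned one stays aligned.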
--
Summary: alloca returning unnecessarily aligned pointer and uses
too much memory
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at tat dot physik dot uni-tuebingen dot de
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19131