The testcase

    int foo(int bar)
    {
      int i, res = 0;
      for (i=0; i<bar; ++i) {
        int *x = __builtin_alloca(4);
        res += *x;
      }
      return res;
    }

uses a lot more memory than necessary, apparently because __builtin_alloca
returns 16-byte aligned stack space (-O2 -fomit-frame-pointer):

.L5:
        subl    $32, %esp       #,
        incl    %edx            # i
        leal    15(%esp), %eax  #, tmp64
        andl    $-16, %eax      #, tmp64
        addl    (%eax), %ecx    #, res
        cmpl    %edx, %ebx      # i, bar
        jg      .L5             #,

I cannot find anything in the C standard about alloca, but alignment bigger
than the requested storage size should not be needed (though this may be
architecture dependent). At the very least, the alignment adjustment should be
moved outside of the loop, which would save half of the wasted memory (I know
the testcase is dumb, but it's at least simple).

Btw., the same problem applies to 3.4. 3.3 doesn't play alignment games but
still allocates 16 bytes each time (which won't be aligned to 16 bytes this way
anyway?). 2.95 beats all of them again in simplicity:

.L6:
        addl    $-16,%esp
        addl    (%esp),%eax
        decl    %edx
        jnz     .L6

-- 
           Summary: alloca returning unnecessarily aligned pointer and uses
                    too much memory
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rguenth at tat dot physik dot uni-tuebingen dot de
                CC: gcc-bugs at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19131
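
For reference, the pointer arithmetic in the 4.0 loop (leal 15(%esp) followed by andl $-16) is the usual round-up-to-a-power-of-two idiom. The sketch below only models that arithmetic and the per-iteration stack cost quoted in the listings (32 bytes for 4.0, 16 bytes for 2.95). It is not GCC's implementation, and align_up and stack_used are hypothetical helper names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of align.
   align must be a power of two. For align == 16 this is the same
   computation as "leal 15(%esp), %eax; andl $-16, %eax". */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~(align - 1);
}

/* Hypothetical helper: total stack consumed by a loop that moves
   the stack pointer down by per_iteration bytes on every pass. */
size_t stack_used(size_t per_iteration, size_t iterations)
{
    return per_iteration * iterations;
}
```

With these helpers, stack_used(32, bar) versus stack_used(16, bar) gives the factor of two between the 4.0 and 2.95 loops. Either figure is far more than the 4 bytes per iteration that the testcase actually requests.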