On Fri, Nov 18, 2011 at 1:24 PM, David Brown <da...@westcontrol.com> wrote:
> On 18/11/2011 10:27, Alexandru Juncu wrote:
>>
>> Hello!
>>
>> I have a curiosity with something I once tested. I took a simple C
>> program and made an assembly file with gcc -S.
>>
>> The C file looks something like this:
>> int main(void)
>> {
>>     int a=1, b=2;
>>     return 0;
>> }
>>
>> The assembly instructions look like this:
>>
>>     subl $16, %esp
>>     movl $1, -4(%ebp)
>>     movl $2, -8(%ebp)
>>
>> The subl $16, means the allocation of local variables on the stack,
>> right? 16 bytes are enough for 4 32bit integers.
>> If I have 1,2,3 or 4 local variables declared, you get those 16 bytes.
>> If I have 5 variables, we have "subl $32, %esp". 5,6,7,8 variables are
>> the same. 9, 10,11,12, 48 bytes.
>>
>> The observation is that gcc allocates increments of 4 variables (if
>> they are integers). If I allocate 8bit chars, increments of 16 chars.
>>
>> So the allocation is in increments of 16 bytes no matter what.
>>
>> OK, that's the observation... my question is why? What's the reason
>> for this, is it an optimization (does it matter what's the -O used?)
>> or is it architecture dependent (I ran it on x86) and is this just in
>> gcc, just in a certain version of gcc or this is universal?
>>
>> Thank you!
>>
>
> This is the wrong mailing list for questions like this - this is the list
> for development of gcc itself, rather than for using it.
Thank you for still answering. I apologize, but I looked at the lists
and this one seemed the most generic. Can you redirect me to another
list where this thread would be appropriate?

>
> However, in answer to your question, the compiler will try to keep the stack
> aligned in units of a suitable size for the processor architecture in use.
> Typically, the processor will be most efficient if the stack is aligned
> with cache lines. I don't know the details of the x86, but presumably
> (level 1) cache lines are 16 bytes wide - or at least, that number fits
> things like internal bus widths, prefetch buffers, etc. Thus the compiler
> makes the tradeoff of using slightly more memory to improve the speed of the
> program.

I tried to compile with --param l1-cache-size... nothing seemed to change.
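
To make the observation concrete, here is a small C sketch of the arithmetic I am seeing. It only reproduces the numbers from my tests (the total size of the locals rounded up to the next multiple of 16). The names frame_size and STACK_STEP are made up for illustration, and this is not a claim about how gcc actually computes the frame.

```c
#include <stddef.h>

/* Hypothetical constant: the 16-byte step observed in the subl values. */
#define STACK_STEP 16u

/*
 * Round the total size of the locals up to the next multiple of
 * STACK_STEP. This mirrors the observed output only; it is an
 * assumption, not gcc's real frame-layout algorithm.
 */
size_t frame_size(size_t locals_bytes)
{
    return (locals_bytes + STACK_STEP - 1) / STACK_STEP * STACK_STEP;
}
```

With 4-byte ints, frame_size(4 * 4) gives 16, frame_size(5 * 4) gives 32 and frame_size(9 * 4) gives 48, which matches the subl values above. For chars, frame_size(16) gives 16 and frame_size(17) gives 32.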