http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45835
Summary: Consider push simm8;pop reg for -Os
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: [email protected]
ReportedBy: [email protected]
CC: [email protected], [email protected]
Target: x86_64-linux
The snippets at http://embed.cs.utah.edu/embarrassing/jan_10/ suggest that for
-Os (not sure whether just for -m32 or also for -m64) icc generates shorter
sequences for loading signed 8-bit immediates into registers.
movl $1, %eax is 5 bytes long, while pushl $1; popl %eax is 3 bytes long for
-m32 (and similarly pushq $1; popq %rax for -m64). For r8..r15 push/pop is
4 bytes, while movl is 6 bytes.
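
The byte counts above can be checked with a small sketch that assembles both
sequences by hand. It assumes the standard x86 encodings (B8+r id for mov with
a 32-bit immediate, 6A ib for push with an 8-bit immediate, 58+r for pop, and
a 0x41 REX.B prefix for r8..r15 in 64-bit mode). The function names are
illustrative only and are not part of gcc.

```python
# Sketch: compare encoded sizes of "mov $imm, reg" and "push $imm8; pop reg".
# Registers are given by hardware number: 0..7 = eax..edi, 8..15 = r8..r15
# (the latter exist only in 64-bit mode). All names here are hypothetical.

def fits_simm8(imm: int) -> bool:
    """True if imm can be encoded as a sign-extended 8-bit immediate."""
    return -128 <= imm <= 127

def rex_prefix(reg: int) -> bytes:
    """REX.B prefix (0x41) is needed to address r8..r15."""
    return b"\x41" if reg >= 8 else b""

def encode_mov_imm32(imm: int, reg: int) -> bytes:
    """movl $imm, %reg: [REX.B] B8+r followed by a 4-byte immediate."""
    opcode = bytes([0xB8 + (reg & 7)])
    return rex_prefix(reg) + opcode + (imm & 0xFFFFFFFF).to_bytes(4, "little")

def encode_push_pop(imm: int, reg: int) -> bytes:
    """push $imm8; pop %reg: 6A ib, then [REX.B] 58+r."""
    if not fits_simm8(imm):
        raise ValueError("immediate does not fit in a signed 8-bit field")
    push = bytes([0x6A, imm & 0xFF])
    pop = rex_prefix(reg) + bytes([0x58 + (reg & 7)])
    return push + pop

if __name__ == "__main__":
    for name, reg in (("eax/rax", 0), ("r8", 8)):
        mov = encode_mov_imm32(1, reg)
        pp = encode_push_pop(1, reg)
        print(f"{name}: mov {mov.hex()} ({len(mov)} bytes), "
              f"push/pop {pp.hex()} ({len(pp)} bytes)")
```

The sketch only compares sizes. It says nothing about speed, and the push/pop
form applies only when the constant fits in a signed 8-bit immediate.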
I'm not sure about the performance implications. Perhaps this should be
controllable by some -m* switch: users like the Linux kernel want -Os
primarily to improve performance, and if push/pop were significantly slower
they might not appreciate it.