http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47477
Summary: [4.6 regression] Sub-optimal mov at end of method
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: [email protected]
ReportedBy: [email protected]
Host: Linux x86-64
Whilst investigating PR35926, I noticed a slight inefficiency in the code
generated by 4.6.0 (20110115) compared with 4.5.1.
The C code from that PR is duplicated here for easy reference:
#include <stdint.h>

typedef struct toto_s *toto_t;
toto_t add (toto_t a, toto_t b) {
  /* Clear the low bit of b, then add the result to a. */
  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b & ~1L);
  return (toto_t)(intptr_t) tmp;
}
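For readers who want to check what the test case computes, here is a minimal, self-contained sketch. The helper name demo_add and the input values 0x1000 and 0x23 are illustrative assumptions, not part of the original PR; the pointers are never dereferenced, only converted to and from integers.

```c
#include <assert.h>
#include <stdint.h>

typedef struct toto_s *toto_t;

/* Same function as in the report: clear the low bit of b, add it to a. */
toto_t add(toto_t a, toto_t b) {
    int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b & ~1L);
    return (toto_t)(intptr_t)tmp;
}

/* Hypothetical helper: exercises add() with arbitrary example values. */
void demo_add(void) {
    toto_t a = (toto_t)(intptr_t)0x1000;
    toto_t b = (toto_t)(intptr_t)0x23;   /* low bit set */
    toto_t r = add(a, b);

    /* 0x23 & ~1 == 0x22, and 0x1000 + 0x22 == 0x1022 */
    assert((uintptr_t)r == 0x1022);
}
```

Calling demo_add() from a main() is enough to run the check; the assertion fires only if the low bit of b is not masked off before the addition.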
The assembly generated by 4.6.0 with -O3 (32-bit x86 output) is:
        .file "PR35926.c"
        .text
        .p2align 4,,15
        .globl add
        .type add, @function
add:
.LFB0:
        .cfi_startproc
        pushl %ebx
        .cfi_def_cfa_offset 8
        .cfi_offset 3, -8
        movl 12(%esp), %eax
        movl 8(%esp), %ecx
        popl %ebx
        .cfi_def_cfa_offset 4
        .cfi_restore 3
        andl $-2, %eax
        addl %eax, %ecx    <==== order of regs inverted
        movl %ecx, %eax    <==== resulting in unnecessary movl
        ret
        .cfi_endproc
.LFE0:
        .size add, .-add
        .ident "GCC: (GNU) 4.6.0 20110115 (experimental)"
        .section .note.GNU-stack,"",@progbits
In 4.5.1, the end of the function is one instruction shorter, because the sum
is accumulated directly in %eax (the return register):
        addl %ecx, %eax
        ret
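To make the difference concrete, the two instruction tails can be modelled in C, one statement per instruction. This is only a sketch: the function names tail_4_6, tail_4_5 and tails_agree are made up for illustration, and the andl step in the 4.5.1 model is assumed to match the 4.6.0 listing, since only the final two instructions of the 4.5.1 output are shown here. On entry, eax holds b and ecx holds a, as loaded from the stack in the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the 4.6.0 tail: the sum lands in %ecx and must be copied. */
uint32_t tail_4_6(uint32_t eax, uint32_t ecx) {
    eax &= 0xFFFFFFFEu;   /* andl $-2, %eax  */
    ecx += eax;           /* addl %eax, %ecx */
    eax = ecx;            /* movl %ecx, %eax (the extra instruction) */
    return eax;           /* ret */
}

/* Model of the 4.5.1 tail: the sum lands directly in %eax.
   The andl is assumed identical to the 4.6.0 listing. */
uint32_t tail_4_5(uint32_t eax, uint32_t ecx) {
    eax &= 0xFFFFFFFEu;   /* andl $-2, %eax  */
    eax += ecx;           /* addl %ecx, %eax */
    return eax;           /* ret */
}

/* Hypothetical helper: returns 1 if both tails produce the same value. */
int tails_agree(uint32_t b, uint32_t a) {
    return tail_4_6(b, a) == tail_4_5(b, a);
}
```

Because addition is commutative, both models return the same value for every input; the only difference is which register receives the sum, and hence whether the trailing movl is needed.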
A bug search turned up the similar-sounding PR44249; however, that one is
apparently also a regression in 4.5, whereas this issue affects only 4.6.