http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35926
--- Comment #5 from Tony Poppleton <tony.poppleton at gmail dot com> 2011-01-25
19:33:20 UTC ---
I can confirm this still exists on both GCC 4.5.1 and GCC 4.6.0 (20110115),
when compiling with -O3.
I did some basic investigation into the files produced by the -fdump-tree-all
flag, which show:
add (struct toto_s * a, struct toto_s * b)
{
int64_t tmp;
int D.2686;
struct toto_s * D.2685;
long long int D.2684;
int D.2683;
int b.1;
long long int D.2681;
int a.0;
<bb 2>:
a.0_2 = (int) a_1(D);
D.2681_3 = (long long int) a.0_2;
b.1_5 = (int) b_4(D);
D.2683_6 = b.1_5 & -2;
D.2684_7 = (long long int) D.2683_6;
tmp_8 = D.2681_3 + D.2684_7;
D.2686_9 = (int) tmp_8;
D.2685_10 = (struct toto_s *) D.2686_9;
return D.2685_10;
}
What I don't understand here is the excessive casting: the addition producing
tmp_8 is done as (long long int), yet both operands are ultimately derived from
(int) variables. Is the cast to (long long int) necessary to handle overflow in
the addition?
If so, why does the final asm code not appear to cater for overflow at all?
Alternatively, could the whole block be simplified down to (int) during this
phase of compilation, thereby avoiding the subsequent unnecessary use of BX
during the RTL phase (as per comment #3)?
As an aside (possibly worth a separate bug report?), there appears to be a
regression in 4.6.0 that requires an additional movl compared to the asm in the
original bug description (4.5.1 does not suffer from this):
.file "PR35926.c"
.text
.p2align 4,,15
.globl add
.type add, @function
add:
.LFB0:
.cfi_startproc
pushl %ebx
.cfi_def_cfa_offset 8
.cfi_offset 3, -8
movl 12(%esp), %eax
movl 8(%esp), %ecx
popl %ebx
.cfi_def_cfa_offset 4
.cfi_restore 3
andl $-2, %eax
addl %eax, %ecx <==== order of regs inverted
movl %ecx, %eax <==== resulting in unnecessary movl
ret
.cfi_endproc
.LFE0:
.size add, .-add
.ident "GCC: (GNU) 4.6.0 20110115 (experimental)"
.section .note.GNU-stack,"",@progbits