This is closely related to bug 43892:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43892
There are many ways to try to add with carry, and it is difficult to catch
them all. I really tried to 'think like a compiler' when I wrote the following
code (C++, 32-bit Intel). It is not even strictly correct C++, and it won't
work on AMD64, since there long long is 64 bits, just like unsigned long, and
__int128 isn't quite there yet.
// Data structures:
struct Skew1Even
{
    unsigned long long data; // This could be an array
    unsigned long unused;
};
struct Skew2Odd
{
    unsigned long unused;
    unsigned long long data; // This could be an array
};
struct ULongLongLong
{
    union
    {
        unsigned long m_data[3];
        Skew1Even m_rep1; // 64-bit view of m_data[0..1]
        Skew2Odd m_rep2;  // 64-bit view of m_data[1..2]
    };
    ULongLongLong()
    {
        m_data[0]=0;
        m_data[1]=0;
        m_data[2]=0;
    }
    // void print() { std::cout << m_data[0] << "," << m_data[1] << ","
    //                          << m_data[2] << "\n"; }
    void addtest(const ULongLongLong &b); // operator +=
};
The addtest member function is the important part:
void ULongLongLong::addtest(const ULongLongLong &b)
{
    // if (this==&b) // removed to make the example easier
    //     doTimes2();
    m_rep1.data+=b.m_data[0];
    m_rep2.data+=b.m_data[1];
    m_data[2]+=b.m_data[2];
}
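For reference, here is a minimal portable sketch of the three-word add with
carry that addtest is trying to express, written with explicit carry
detection instead of the skewed union. The names U96 and add_with_carry are
made up for this illustration and are not part of the test case above. Note
that this sketch always propagates a carry out of the middle limb, including
the corner case where that limb overflows only because of the incoming carry
(for example {0xFFFFFFFF, 0xFFFFFFFF, 0} plus {1, 0, 0}).

```cpp
#include <cstdint>

// Hypothetical stand-in for ULongLongLong: three 32-bit limbs,
// least significant limb first.
struct U96
{
    std::uint32_t limb[3];
};

// a += b, propagating the carry by hand from limb to limb.
void add_with_carry(U96 &a, const U96 &b)
{
    std::uint32_t carry = 0;
    for (int i = 0; i < 3; ++i)
    {
        std::uint32_t sum = a.limb[i] + b.limb[i]; // wraps modulo 2^32
        std::uint32_t c1 = sum < b.limb[i];        // carry out of a + b
        sum += carry;
        std::uint32_t c2 = sum < carry;            // carry out of adding the carry in
        a.limb[i] = sum;
        carry = c1 | c2;                           // at most one of the two is set
    }
}
```

Whether a compiler turns this loop into an add/adc/adc chain depends on the
compiler version and options; the sketch only pins down the arithmetic.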
The key pattern in my source also shows up in the compiled code, but the
compiler does not take advantage of it. What I hoped would happen was that gcc
would see that adding 0 with carry, 'quickly' followed by a normal add to the
same location, is the same as doing just that last add, but with carry.
However, I only get this code:
        .globl _ZN13ULongLongLong7addtestERKS_
        .type _ZN13ULongLongLong7addtestERKS_, @function
_ZN13ULongLongLong7addtestERKS_:
.LFB964:
        .cfi_startproc
        .cfi_personality 0x0,__gxx_personality_v0
        pushl %ebp
        .cfi_def_cfa_offset 8
        movl %esp, %ebp
        .cfi_offset 5, -8
        .cfi_def_cfa_register 5
        movl 12(%ebp), %edx
        movl 8(%ebp), %eax
        pushl %ebx
        xorl %ebx, %ebx
        .cfi_offset 3, -12
        movl (%edx), %ecx
        addl %ecx, (%eax)
        adcl %ebx, 4(%eax)
        xorl %ebx, %ebx
        movl 4(%edx), %ecx
        addl %ecx, 4(%eax)
        adcl %ebx, 8(%eax)
        movl 8(%edx), %edx
        addl %edx, 8(%eax)
        popl %ebx
        popl %ebp
        ret
        .cfi_endproc
What I wanted was this code:
        .globl _ZN13ULongLongLong7addtestERKS_
        .type _ZN13ULongLongLong7addtestERKS_, @function
_ZN13ULongLongLong7addtestERKS_:
.LFB1001:
        .cfi_startproc
        .cfi_personality 0x0,__gxx_personality_v0
        pushl %ebp
        .cfi_def_cfa_offset 8
        movl %esp, %ebp
        .cfi_offset 5, -8
        .cfi_def_cfa_register 5
        movl 12(%ebp), %edx
        movl 8(%ebp), %eax
        /* pushl %ebx */ /* not needed anymore - we don't use it */
        /* xorl %ebx, %ebx   No need to reset ebx */
        /* .cfi_offset 3, -12   ebx is no longer saved */
        movl (%edx), %ecx
        addl %ecx, (%eax)
        /* adcl %ebx, 4(%eax) */
        /* xorl %ebx, %ebx   Why do it at all - ebx was already 0 !? */
        movl 4(%edx), %ecx
        adcl %ecx, 4(%eax) /* modified addl to adcl */
        /* adcl %ebx, 8(%eax) */
        movl 8(%edx), %edx
        adcl %edx, 8(%eax) /* modified addl to adcl */
        /* popl %ebx */
        popl %ebp
        ret
        .cfi_endproc
Note: The same seems to apply when the additions are replaced with
subtractions (subtract with borrow).
It may still be better to make carry handling work more generally, and I
understand that this might end up as a WONTFIX, especially if you provide a
clear way to add with carry in general.
However, this might just be a much easier peephole(-like) optimization.
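As an illustration of what 'a clear way to add with carry' could look like:
GCC releases from 5 onwards (so not the 4.4.1 this report is filed against)
provide __builtin_add_overflow, which returns whether an addition wrapped.
The sketch below builds a three-limb add on top of it. The helper names
add_carry and add96 are hypothetical, and whether the result compiles to an
add/adc chain still depends on the compiler version and flags.

```cpp
#include <cstdint>

// Hypothetical helper: returns a + b + carry_in (mod 2^32) and stores
// the carry out (0 or 1) in carry_out. Needs GCC >= 5 or a compiler
// that implements the same builtin.
static inline std::uint32_t add_carry(std::uint32_t a, std::uint32_t b,
                                      std::uint32_t carry_in,
                                      std::uint32_t &carry_out)
{
    std::uint32_t partial, sum;
    bool c1 = __builtin_add_overflow(a, b, &partial);
    bool c2 = __builtin_add_overflow(partial, carry_in, &sum);
    carry_out = (c1 || c2) ? 1u : 0u; // both cannot wrap at the same time
    return sum;
}

// a += b for three 32-bit limbs, least significant limb first.
void add96(std::uint32_t a[3], const std::uint32_t b[3])
{
    std::uint32_t carry = 0;
    for (int i = 0; i < 3; ++i)
    {
        std::uint32_t carry_out;
        a[i] = add_carry(a[i], b[i], carry, carry_out);
        carry = carry_out;
    }
}
```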
PS: Thanks for a really great compiler.
--
Summary: Add with carry - missed optimization
Product: gcc
Version: 4.4.1
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: tmartsum at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45548