The following two equivalent functions are compiled into different assembly
code. Unfortunately, the more readable function (get_and_increment1) produces
worse code: it is bigger and slower, because it uses one more register (r0)
and needs an extra mr instruction compared with the hand-optimized version,
get_and_increment2.

struct IntPtr
{
    int* m_ReadPtr;
};

int get_and_increment1(struct IntPtr* i)
{
    return *(i->m_ReadPtr++);
}

int get_and_increment2(struct IntPtr* i)
{
    i->m_ReadPtr++;
    return *(i->m_ReadPtr - 1);
}

00000000 <get_and_increment1>:
   0:   81 23 00 00     lwz     r9,0(r3)
   4:   80 09 00 00     lwz     r0,0(r9)
   8:   39 29 00 04     addi    r9,r9,4
   c:   91 23 00 00     stw     r9,0(r3)
  10:   7c 03 03 78     mr      r3,r0
  14:   4e 80 00 20     blr

00000018 <get_and_increment2>:
  18:   81 23 00 00     lwz     r9,0(r3)
  1c:   39 29 00 04     addi    r9,r9,4
  20:   91 23 00 00     stw     r9,0(r3)
  24:   80 69 ff fc     lwz     r3,-4(r9)
  28:   4e 80 00 20     blr
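
The report relies on the two functions being equivalent. A minimal sketch that checks this on a small array is given here. The helper check_equivalent and its test data are assumptions made for illustration and are not part of the original report. It only compares observable behaviour (returned values and the final pointer) and says nothing about the generated code.

```c
#include <stddef.h>

struct IntPtr
{
    int* m_ReadPtr;
};

/* Post-increment form, as in the report. */
int get_and_increment1(struct IntPtr* i)
{
    return *(i->m_ReadPtr++);
}

/* Increment first, then read back through offset -1, as in the report. */
int get_and_increment2(struct IntPtr* i)
{
    i->m_ReadPtr++;
    return *(i->m_ReadPtr - 1);
}

/* Hypothetical helper (not in the report): returns 1 if both functions
   return the same values and advance the pointer identically, else 0. */
int check_equivalent(void)
{
    int data[4] = { 10, 20, 30, 40 };
    struct IntPtr a = { data };
    struct IntPtr b = { data };
    size_t n;

    for (n = 0; n < 4; ++n) {
        if (get_and_increment1(&a) != get_and_increment2(&b))
            return 0;
        if (a.m_ReadPtr != b.m_ReadPtr)
            return 0;
    }
    /* Both pointers should now be one past the last element. */
    return a.m_ReadPtr == data + 4;
}
```

Calling check_equivalent() from a small main and testing for a non-zero result is enough to confirm that any difference between the two listings is purely a code-generation matter.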
--
Summary: missed optimization for pointer access with offset on powerpc
Product: gcc
Version: 4.2.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rbuergel at web dot de
GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: powerpc-linux-uclibc
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36693