I noticed a performance regression between gcc 4.1.1 and the 4.3 trunk (about 1.8s versus 2.4s user time) on the following code:
$ cat a.c
#include <stdint.h>
#include <stdio.h>
void
add256 (uint64_t x[4], const uint64_t y[4])
{
  unsigned char carry;
  x[0] += y[0];
  carry = (x[0] < y[0]);
  x[1] += y[1]+carry;
  carry = carry ? (x[1] <= y[1]) : (x[1] < y[1]);
  x[2] += y[2]+carry;
  carry = carry ? (x[2] <= y[2]) : (x[2] < y[2]);
  x[3] += y[3]+carry;
}
int
main (void)
{
  int i;
  uint64_t x[4], y[4];
  x[0] = 0; x[1] = 0; x[2] = 0; x[3] = 0;
  y[0] = 0x0123456789abcdefULL;
  y[1] = 0xfedcba9876543210ULL;
  y[2] = 0xdeadbeeff001baadULL;
  y[3] = 0x001001001001ffffULL;
  for ( i=0 ; i<100000000 ; i++ )
    add256 (x, y);
  printf ("%016llx%016llx%016llx%016llx\n",
          (unsigned long long)x[3],
          (unsigned long long)x[2],
          (unsigned long long)x[1],
          (unsigned long long)x[0]);
  return 0;
}
$ gcc -march=pentium4 -O3 a.c && time ./a.out
064069fbc13963b920219c3e939225e38e38e38e3956d81c71c71c71c0ba0f00
./a.out 1.81s user 0.00s system 99% cpu 1.818 total
$ gcc-4.3 -march=pentium4 -O3 a.c && time ./a.out
064069fbc13963b920219c3e939225e38e38e38e3956d81c71c71c71c0ba0f00
./a.out 2.40s user 0.01s system 87% cpu 2.746 total
where gcc is gcc version 4.1.1 20070105 (Red Hat 4.1.1-51) and gcc-4.3
is gcc version 4.3.0 20070209 (experimental).
Pawel Sikora confirmed that he sees the same kind of regression between 4.2 and
4.3 (http://gcc.gnu.org/ml/gcc/2007-02/msg00319.html).
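
As a sanity check on the test case itself (this is not part of the measurements above), the carry chain in add256 can be compared against an independent computation: adding y to zero n times should give n*y modulo 2^256. The sketch below assumes a hypothetical helper mul256_small, which is not in the original test case, and which multiplies by a 32-bit scalar over 32-bit half-limbs so that every partial product fits in 64 bits. The function add256_matches_mul is likewise only illustrative.

```c
#include <stdint.h>

/* add256 as in the test case: x += y over four 64-bit limbs,
   least significant limb first. */
void
add256 (uint64_t x[4], const uint64_t y[4])
{
  unsigned char carry;
  x[0] += y[0];
  carry = (x[0] < y[0]);
  x[1] += y[1]+carry;
  /* With an incoming carry, the wrapped sum can equal y[i], hence <=.  */
  carry = carry ? (x[1] <= y[1]) : (x[1] < y[1]);
  x[2] += y[2]+carry;
  carry = carry ? (x[2] <= y[2]) : (x[2] < y[2]);
  x[3] += y[3]+carry;
}

/* Hypothetical reference (an assumption, not from the report):
   r = n * y mod 2^256, using 32-bit half-limbs so that each
   partial product plus carry stays below 2^64.  */
void
mul256_small (uint64_t r[4], const uint64_t y[4], uint32_t n)
{
  uint64_t carry = 0, lo, hi;
  int i;
  for (i = 0; i < 4; i++)
    {
      lo = (y[i] & 0xffffffffu) * n + carry;
      hi = (y[i] >> 32) * n + (lo >> 32);
      r[i] = (hi << 32) | (lo & 0xffffffffu);
      carry = hi >> 32;
    }
}

/* Returns 1 if n repeated additions of y to zero equal n * y.  */
int
add256_matches_mul (const uint64_t y[4], uint32_t n)
{
  uint64_t x[4] = { 0, 0, 0, 0 }, r[4];
  uint32_t i;
  for (i = 0; i < n; i++)
    add256 (x, y);
  mul256_small (r, y, n);
  return x[0] == r[0] && x[1] == r[1] && x[2] == r[2] && x[3] == r[3];
}
```

Calling add256_matches_mul with the y from the test case and a small n (say 1000) exercises the carry chain quickly. Calling mul256_small with n = 100000000 should reproduce the value that both compilers print, which would indicate that the two binaries differ in speed only and not in the result.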
--
Summary: 4.3 performance regression on uint64_t operations
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: fxcoudert at gcc dot gnu dot org
GCC build triplet: x86_64-pc-linux-gnu
GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30801