With these compile options:
-Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math
-fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -fno-common -mieee-fp
With this compiler:
euler-44% /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../mainline/configure --prefix=/pkgs/gcc-mainline
--enable-languages=c --enable-checking=release --with-gmp=/pkgs/gmp-4.2.2
--with-mpfr=/pkgs/gmp-4.2.2
Thread model: posix
gcc version 4.3.0 20071026 (experimental) [trunk revision 129664] (GCC)
With the following routine compiled with gcc-4.2.2 you get
(time (direct-fft-recursive-4 a table))
366 ms real time
366 ms cpu time (366 user, 0 system)
no collections
64 bytes allocated
no minor faults
no major faults
while with today's mainline you get
(time (direct-fft-recursive-4 a table))
448 ms real time
448 ms cpu time (448 user, 0 system)
no collections
64 bytes allocated
no minor faults
no major faults
I've isolated that one routine and I'll add it at the end of an attachment;
unfortunately there are a lot of declarations and global data that are
difficult to winnow.
There is really only one main loop in the routine, the one that begins at
___L19_direct_2d_fft_2d_recursive_2d_4. This loop was scheduled in 102 cycles
(sched2) on 4.2.2 and in 134 cycles on mainline.
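
For reference, the relative slowdowns implied by the figures above can be
checked with a few lines of arithmetic. This is only a sketch: the helper name
slowdown_percent is made up for illustration, and the inputs are the timings
(366 ms vs. 448 ms) and sched2 cycle counts (102 vs. 134) quoted in this report.

```python
def slowdown_percent(baseline, current):
    """Return how much larger `current` is than `baseline`, in percent."""
    return (current - baseline) / baseline * 100.0

# Figures quoted in this report (older release first, then mainline).
time_slowdown = slowdown_percent(366, 448)    # ms, from the (time ...) output
cycle_slowdown = slowdown_percent(102, 134)   # sched2 cycles for the main loop

print(f"run time: {time_slowdown:.1f}% slower")
print(f"sched2 cycles: {cycle_slowdown:.1f}% more")
```

The measured run time grows by about 22%, while the scheduled cycle count for
the main loop grows by about 31%, so the scheduler's estimate and the
wall-clock measurement point the same way but do not match exactly.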
--
Summary: 22% performance slowdown from 4.2.2 in floating-point
code
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: regression
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: lucier at math dot purdue dot edu
GCC build triplet: x86_64-unknown-linux-gnu
GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33928