Better performance on older version of GCC

2010-08-27 Thread Corey Kasten
Hello all,

I have two computers with identical hardware but different versions of
GCC. I have a processor- and memory-intensive benchmark program which I
compile on both systems, and I cannot understand why the system with the
older GCC version produces faster code.

System A has GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)"
System B has GCC version "4.3.0 20080428 (Red Hat 4.3.0-8)"

I find that the executable compiled on System A runs faster (on both
systems) than the executable compiled on System B (on both systems), by
a factor of approximately 4. I have experimented with the GCC optimizer
flags and have not been able to get System B (with the later GCC
version) to produce code with any better performance. Could someone
please help figure this out?

Below is the GCC command I run on System A followed by the verbose
output:
gcc -v -Wall -DOFFLINE_WEIGHTS -DDOUBLEP -g bfbenchmark_threaded.c -lm
-lrt -lpthread -O3 -o bfbenchmark_threaded

---BEGIN OUTPUT-
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-cpu=generic
--host=i386-redhat-linux
Thread model: posix
gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)
 /usr/libexec/gcc/i386-redhat-linux/4.1.2/cc1 -quiet -v
-DOFFLINE_WEIGHTS -DDOUBLEP bfbenchmark_threaded.c -quiet -dumpbase
bfbenchmark_threaded.c -mtune=generic -auxbase bfbenchmark_threaded -g
-O3 -Wall -version -o /tmp/ccvxPCd0.s
ignoring nonexistent directory
"/usr/lib/gcc/i386-redhat-linux/4.1.2/../../../../i386-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/gcc/i386-redhat-linux/4.1.2/include
 /usr/include
End of search list.
GNU C version 4.1.2 20070925 (Red Hat 4.1.2-33) (i386-redhat-linux)
compiled by GNU C version 4.1.2 20070925 (Red Hat 4.1.2-33).
GGC heuristics: --param ggc-min-expand=100 --param
ggc-min-heapsize=131072
Compiler executable checksum: ab322ce5b87a7c6c23d60970ec7b7b31
 as -V -Qy -o /tmp/ccU8kZL1.o /tmp/ccvxPCd0.s
GNU assembler version 2.17.50.0.18 (i386-redhat-linux) using BFD version
version 2.17.50.0.18-1 20070731
 /usr/libexec/gcc/i386-redhat-linux/4.1.2/collect2 --eh-frame-hdr
--build-id -m elf_i386 --hash-style=gnu
-dynamic-linker /lib/ld-linux.so.2 -o
bfbenchmark_threaded /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crt1.o 
/usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crti.o 
/usr/lib/gcc/i386-redhat-linux/4.1.2/crtbegin.o 
-L/usr/lib/gcc/i386-redhat-linux/4.1.2 -L/usr/lib/gcc/i386-redhat-linux/4.1.2 
-L/usr/lib/gcc/i386-redhat-linux/4.1.2/../../.. /tmp/ccU8kZL1.o -lm -lrt 
-lpthread -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed 
-lgcc_s --no-as-needed /usr/lib/gcc/i386-redhat-linux/4.1.2/crtend.o 
/usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crtn.o
---END OUTPUT-



Below is the GCC command I run on System B followed by the verbose
output:
gcc -v -Wall -DOFFLINE_WEIGHTS -DDOUBLEP -g bfbenchmark_threaded.c -lm
-lrt -lpthread -O3 -o bfbenchmark_threaded

---BEGIN OUTPUT-
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap
--enable-shared --enable-threads=posix --enable-checking=release
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--disable-libjava-multilib --with-cpu=generic --build=i386-redhat-linux
Thread model: posix
gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) 
COLLECT_GCC_OPTIONS='-v' '-Wall' '-DOFFLINE_WEIGHTS' '-DDOUBLEP' '-g'
'-O3' '-o' 'bfbenchmark_threaded' '-mtune=generic'
 /usr/libexec/gcc/i386-redhat-linux/4.3.0/cc1 -quiet -v
-DOFFLINE_WEIGHTS -DDOUBLEP bfbenchmark_threaded.c -quiet -dumpbase
bfbenchmark_threaded.c -mtune=generic -auxbase bfbenchmark_threaded -g
-O3 -Wall -version -o /tmp/ccB4B5PI.s
ignoring nonexistent directory
"/usr/lib/gcc/i386-redhat-linux/4.3.0/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/i386-redhat-linux/4.3.0/../../../../i386-redhat-linux/include"
#include "..." sear

Re: Better performance on older version of GCC

2010-08-27 Thread Corey Kasten
On Fri, 2010-08-27 at 06:50 -0700, Nathan Froyd wrote:
> On Fri, Aug 27, 2010 at 09:44:25AM -0400, Corey Kasten wrote:
> > I find that the executable compiled on System A runs faster (on both
> > systems) than the executable compiled on System B (on both systems), by
> > a factor of approximately 4. I have attempted to play with the
> > GCC optimizer flags and have not been able to get System B (with the
> > later GCC version) to compile code with any better performance. Could
> > someone please help figure this out?
> 
> It's almost impossible to tell what's going on without an actual
> testcase.  You might not be able to provide the actual code, but you
> could try distilling it down to something you could release.
> 
> -Nathan

Thanks for the reply Nathan.

I have attached an archive with the test case code. The code is built by
build.sh and outputs the number of microseconds to complete the
processing.

Code compiled with GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)" runs
in about 66% of the time taken by code compiled with GCC version "4.3.0
20080428 (Red Hat 4.3.0-8)".

Thanks

Corey


testbenchmark.100827.1050.tgz
Description: application/compressed-tar


Re: Better performance on older version of GCC

2010-08-27 Thread Corey Kasten
On Fri, 2010-08-27 at 17:09 +0200, Richard Guenther wrote:
> On Fri, Aug 27, 2010 at 5:02 PM, Corey Kasten wrote:
> > On Fri, 2010-08-27 at 06:50 -0700, Nathan Froyd wrote:
> >> On Fri, Aug 27, 2010 at 09:44:25AM -0400, Corey Kasten wrote:
> >> > I find that the executable compiled on System A runs faster (on both
> >> > systems) than the executable compiled on System B (on both systems), by
> >> > a factor of approximately 4. I have attempted to play with the
> >> > GCC optimizer flags and have not been able to get System B (with the
> >> > later GCC version) to compile code with any better performance. Could
> >> > someone please help figure this out?
> >>
> >> It's almost impossible to tell what's going on without an actual
> >> testcase.  You might not be able to provide the actual code, but you
> >> could try distilling it down to something you could release.
> >>
> >> -Nathan
> >
> > Thanks for the reply Nathan.
> >
> > I have attached an archive with the test case code. The code is built by
> > build.sh and outputs the number of microseconds to complete the
> > processing.
> >
> > Code compiled with GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)" runs
> > in about 66% of the time taken by code compiled with GCC version "4.3.0
> > 20080428 (Red Hat 4.3.0-8)".
> 
> Try -fcx-limited-range or -fcx-fortran-rules.  4.3 is now more
> conforming than 4.1.
> 
> Richard.
> 
> > Thanks
> >
> > Corey
> >

Richard,

-fcx-limited-range worked great on both my real benchmark and my test
archive. GCC didn't recognize -fcx-fortran-rules, but obviously I don't
need it.

Thanks so much,
Corey
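
For readers wondering why this flag matters, here is a minimal sketch. It
assumes the benchmark multiplies C99 complex values in its inner loop; the
thread never shows the source, so that is an assumption. Without
-fcx-limited-range, a conforming compiler has to check each complex product
for NaN results and recover infinities, typically through a libgcc helper
such as __muldc3. With the flag, it may emit the plain four-multiply
formula, which is what mul_limited writes out by hand below. The names
mul_limited, dot_builtin and dot_limited are hypothetical and exist only
for illustration.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Textbook complex multiply: four multiplies, one add, one subtract,
 * and no recovery step for NaN or infinite intermediate results.
 * Roughly the code -fcx-limited-range allows the compiler to inline. */
static double complex mul_limited(double complex a, double complex b)
{
    double re = creal(a) * creal(b) - cimag(a) * cimag(b);
    double im = creal(a) * cimag(b) + cimag(a) * creal(b);
    return re + im * I; /* adequate for finite values */
}

/* Hypothetical kernel: complex dot product using the built-in '*'.
 * How this compiles depends on the -fcx-* flags in effect. */
static double complex dot_builtin(const double complex *x,
                                  const double complex *w, size_t n)
{
    double complex acc = 0.0;
    size_t i;

    assert(x != NULL && w != NULL);
    for (i = 0; i < n; i++)
        acc += x[i] * w[i];
    return acc;
}

/* Same kernel with the limited-range formula spelled out by hand. */
static double complex dot_limited(const double complex *x,
                                  const double complex *w, size_t n)
{
    double complex acc = 0.0;
    size_t i;

    assert(x != NULL && w != NULL);
    for (i = 0; i < n; i++)
        acc += mul_limited(x[i], w[i]);
    return acc;
}
```

For finite inputs the two kernels return the same value. Timing
dot_builtin on large arrays, built with and without -fcx-limited-range, is
one way to check whether a slowdown like the one in this thread comes from
complex multiplication.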