Better performance on older version of GCC
Hello all,

I have two computers with identical hardware but different versions of GCC. I compile a processor- and memory-intensive benchmark program on both systems, and I cannot understand why the older GCC version produces faster code.

System A has GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)"
System B has GCC version "4.3.0 20080428 (Red Hat 4.3.0-8)"

The executable compiled on System A runs about 4 times faster (on both systems) than the executable compiled on System B (on both systems). I have experimented with the GCC optimizer flags and have not been able to get System B (with the later GCC version) to produce code that performs any better. Could someone please help figure this out?

Below is the GCC command I run on System A, followed by the verbose output:

gcc -v -Wall -DOFFLINE_WEIGHTS -DDOUBLEP -g bfbenchmark_threaded.c -lm -lrt -lpthread -O3 -o bfbenchmark_threaded

---BEGIN OUTPUT-
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-cpu=generic --host=i386-redhat-linux
Thread model: posix
gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)
/usr/libexec/gcc/i386-redhat-linux/4.1.2/cc1 -quiet -v -DOFFLINE_WEIGHTS -DDOUBLEP bfbenchmark_threaded.c -quiet -dumpbase bfbenchmark_threaded.c -mtune=generic -auxbase bfbenchmark_threaded -g -O3 -Wall -version -o /tmp/ccvxPCd0.s
ignoring nonexistent directory "/usr/lib/gcc/i386-redhat-linux/4.1.2/../../../../i386-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/usr/lib/gcc/i386-redhat-linux/4.1.2/include
/usr/include
End of search list.
GNU C version 4.1.2 20070925 (Red Hat 4.1.2-33) (i386-redhat-linux)
compiled by GNU C version 4.1.2 20070925 (Red Hat 4.1.2-33).
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: ab322ce5b87a7c6c23d60970ec7b7b31
as -V -Qy -o /tmp/ccU8kZL1.o /tmp/ccvxPCd0.s
GNU assembler version 2.17.50.0.18 (i386-redhat-linux) using BFD version version 2.17.50.0.18-1 20070731
/usr/libexec/gcc/i386-redhat-linux/4.1.2/collect2 --eh-frame-hdr --build-id -m elf_i386 --hash-style=gnu -dynamic-linker /lib/ld-linux.so.2 -o bfbenchmark_threaded /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crt1.o /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crti.o /usr/lib/gcc/i386-redhat-linux/4.1.2/crtbegin.o -L/usr/lib/gcc/i386-redhat-linux/4.1.2 -L/usr/lib/gcc/i386-redhat-linux/4.1.2 -L/usr/lib/gcc/i386-redhat-linux/4.1.2/../../.. /tmp/ccU8kZL1.o -lm -lrt -lpthread -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/i386-redhat-linux/4.1.2/crtend.o /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crtn.o
---END OUTPUT-

Below is the GCC command I run on System B, followed by the verbose output:

gcc -v -Wall -DOFFLINE_WEIGHTS -DDOUBLEP -g bfbenchmark_threaded.c -lm -lrt -lpthread -O3 -o bfbenchmark_threaded

---BEGIN OUTPUT-
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-cpu=generic --build=i386-redhat-linux
Thread model: posix
gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC)
COLLECT_GCC_OPTIONS='-v' '-Wall' '-DOFFLINE_WEIGHTS' '-DDOUBLEP' '-g' '-O3' '-o' 'bfbenchmark_threaded' '-mtune=generic'
/usr/libexec/gcc/i386-redhat-linux/4.3.0/cc1 -quiet -v -DOFFLINE_WEIGHTS -DDOUBLEP bfbenchmark_threaded.c -quiet -dumpbase bfbenchmark_threaded.c -mtune=generic -auxbase bfbenchmark_threaded -g -O3 -Wall -version -o /tmp/ccB4B5PI.s
ignoring nonexistent directory "/usr/lib/gcc/i386-redhat-linux/4.3.0/include-fixed"
ignoring nonexistent directory "/usr/lib/gcc/i386-redhat-linux/4.3.0/../../../../i386-redhat-linux/include"
#include "..." sear
Re: Better performance on older version of GCC
On Fri, 2010-08-27 at 06:50 -0700, Nathan Froyd wrote:
> On Fri, Aug 27, 2010 at 09:44:25AM -0400, Corey Kasten wrote:
> > I find that the executable compiled on system A runs faster (on both
> > systems) than the executable compiled on system B (on both system), by a
> > factor about approximately 4 times. I have attempted to play with the
> > GCC optimizer flags and have not been able to get System B (with the
> > later GCC version) to compile code with any better performance. Could
> > someone please help figure this out?
>
> It's almost impossible to tell what's going on without an actual
> testcase. You might not be able to provide the actual code, but you
> could try distilling it down to something you could release.
>
> -Nathan

Thanks for the reply, Nathan.

I have attached an archive with the test case code. The code is built by build.sh and outputs the number of microseconds taken to complete the processing.

Compiling with GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)" produces code that runs in about 66% of the time taken by code compiled with GCC version "4.3.0 20080428 (Red Hat 4.3.0-8)".

Thanks,
Corey

Attachment: testbenchmark.100827.1050.tgz
Description: application/compressed-tar
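For anyone reproducing the comparison without the attachment, a microsecond measurement like the one described above can be sketched as follows. This is only an illustration, not the code from the archive: the names elapsed_us and time_workload_us are hypothetical, and the use of clock_gettime with CLOCK_MONOTONIC is an assumption (the original build links -lrt, which older glibc versions require for clock_gettime, but the thread does not say which timer the benchmark uses).

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Difference between two timestamps, in whole microseconds.
   (Hypothetical helper; not taken from the attached test case.) */
static int64_t elapsed_us(struct timespec start, struct timespec end)
{
    int64_t sec  = (int64_t)end.tv_sec - (int64_t)start.tv_sec;
    int64_t nsec = (int64_t)end.tv_nsec - (int64_t)start.tv_nsec;
    /* Combine into nanoseconds first so a negative nsec part is handled. */
    return (sec * INT64_C(1000000000) + nsec) / 1000;
}

/* Run a workload once and return how long it took in microseconds,
   or -1 if the clock could not be read. */
static int64_t time_workload_us(void (*workload)(void))
{
    struct timespec start, end;
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    workload();
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;
    return elapsed_us(start, end);
}
```

Building the same source with each compiler and passing the benchmark's main processing function to time_workload_us gives two numbers that can be compared directly, provided both binaries are run on the same machine.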
Re: Better performance on older version of GCC
On Fri, 2010-08-27 at 17:09 +0200, Richard Guenther wrote:
> On Fri, Aug 27, 2010 at 5:02 PM, Corey Kasten wrote:
> > On Fri, 2010-08-27 at 06:50 -0700, Nathan Froyd wrote:
> >> On Fri, Aug 27, 2010 at 09:44:25AM -0400, Corey Kasten wrote:
> >> > I find that the executable compiled on system A runs faster (on both
> >> > systems) than the executable compiled on system B (on both system), by a
> >> > factor about approximately 4 times. I have attempted to play with the
> >> > GCC optimizer flags and have not been able to get System B (with the
> >> > later GCC version) to compile code with any better performance. Could
> >> > someone please help figure this out?
> >>
> >> It's almost impossible to tell what's going on without an actual
> >> testcase. You might not be able to provide the actual code, but you
> >> could try distilling it down to something you could release.
> >>
> >> -Nathan
> >
> > Thanks for the reply Nathan.
> >
> > I have attached an archive with the test case code. The code is built by
> > build.sh and outputs the number of microseconds to complete the
> > processing.
> >
> > Compiling with GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)" produces
> > code that runs in about 66% of the time than does GCC version "4.3.0
> > 20080428 (Red Hat 4.3.0-8)"
>
> -fcx-limited-range or -fcx-fortran-rules. 4.3 now is more conforming than
> 4.1.
>
> Richard.
>
> > Thanks
> >
> > Corey

Richard,

-fcx-limited-range worked great on both my real benchmark and my test archive. GCC didn't recognize -fcx-fortran-rules, but obviously I don't need it.

Thanks so much,
Corey
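For readers wondering why that flag matters: -fcx-limited-range controls how GCC compiles complex multiplication and division. Without it, a product of two double complex values has to follow C99 Annex G, which means checking whether the straightforward formula produced NaN + NaN*i and, if so, recovering the correct infinite result; GCC does this through the libgcc helper __muldc3. With the flag, the compiler may use the plain four-multiply formula with no such check. The sketch below assumes the benchmark's hot loop multiplies double complex values, which the thread suggests but does not show, and all function names in it are hypothetical.

```c
#include <complex.h>
#include <math.h>
#include <string.h>

/* Build a double complex from its parts without doing any complex
   arithmetic. C99 guarantees a complex type has the same layout as an
   array of two elements of the corresponding real type. */
static double complex make_complex(double re, double im)
{
    double parts[2];
    double complex z;
    parts[0] = re;
    parts[1] = im;
    memcpy(&z, parts, sizeof z);
    return z;
}

/* Textbook product: four multiplies, one subtract, one add, and no
   NaN/infinity recovery. Roughly what the compiler is permitted to
   generate for a * b when -fcx-limited-range is in effect. */
static double complex mul_limited_range(double complex a, double complex b)
{
    double ar = creal(a), ai = cimag(a);
    double br = creal(b), bi = cimag(b);
    return make_complex(ar * br - ai * bi, ar * bi + ai * br);
}

/* The built-in operator. Its cost and its behaviour for infinite or NaN
   operands depend on the flags this file is compiled with. */
static double complex mul_builtin(double complex a, double complex b)
{
    return a * b;
}
```

For finite inputs the two functions agree, for example (1 + 2i) * (3 + 4i) = -5 + 10i either way. They differ only in edge cases: mul_limited_range applied to (inf + inf*i) and (1 + 0i) yields NaN in both parts, whereas Annex G requires an infinite result. If the benchmark never feeds infinities or NaNs into its complex arithmetic, compiling with something like gcc -O3 -fcx-limited-range trades away only that edge-case handling.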