http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Bug #: 54000
Summary: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: benedict.ge...@ins.uni-bonn.de

Dear experts,

I got here from the gcc-help mailing list and have not submitted any bug reports before, so I hope for your patience.

In a self-written library used for numerical computations we have some typical programs that serve as benchmarks for new compiler versions or optimization flags. When gcc-4.6 was released we noticed a performance breakdown. The problem persisted with gcc-4.7.

I tried to produce a minimal stand-alone example and followed the instructions at http://gcc.gnu.org/bugs/minimize.html. Because std::vector is included, however, I was not able to arrive at a really small file.

What you see at the end of the file is actually just a matrix-vector multiplication repeated 1000 times. However, the matrix has a highly specific structure that is encountered when performing numerical computations using the Finite Element Method (FEM), i.e.:

std::vector<MinimalVec3> rows[9];

Thus it consists of 9 bands of triples of doubles. The length of each band corresponds to the length of the vector it is applied to.

Compiling with gcc-4.5.0 (our current standard), the 'time' command gives:

real 1m32.606s

Using gcc-4.7.0 we get:

real 2m6.923s

When removing the member variable "double stuff" in "class MinimalVector" and using gcc-4.7.0 we get:

real 1m27.354s

Using a C array instead of std::vector above resolves this issue.

The specifications of the two compilers used are:

Using built-in specs.
COLLECT_GCC=/home/prog/gcc-4.5.0-64/bin/g++
COLLECT_LTO_WRAPPER=/home/prog/gcc-4.5.0-64/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --prefix=/home/prog/gcc-4.5.0-64/ --enable-languages=c,c++,fortran --disable-multilib --enable-lto --with-libelf=/home/prog/libelf-64/ --with-ppl=/home/prog/ppl-64/ --with-cloog=/home/prog/cloog-ppl-64/
Thread model: posix
gcc version 4.5.0 (GCC)

and

Using built-in specs.
COLLECT_GCC=/home/prog/gcc-4.7.0-64/bin/g++
COLLECT_LTO_WRAPPER=/home/prog/gcc-4.7.0-64/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.7.0/configure --prefix=/home/prog/gcc-4.7.0-64/ --enable-languages=c,c++,fortran --with-gmp=/home/prog/gmp-5.0.4-64/ --with-ppl=/home/prog/ppl-0.12-64/ --enable-cloog-backend=isl --with-cloog=/home/prog/cloog-0.16.3-64/ --disable-multilib --enable-libstdcxx-debug
Thread model: posix
gcc version 4.7.0 (GCC)

They have been compiled manually on a machine running openSuse 11.3.

The command line was:

g++ -O2 -o minmvmult minmvmult.ii

There were no warnings or error messages.

We'd be grateful for any suggestions.

Best regards
Benedict
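
For readers without the attachment, the kind of data structure described in the report can be sketched roughly as follows. This is not the attached test case. Only the declaration std::vector<MinimalVec3> rows[9] is taken from the report. The layout of MinimalVec3, the BandMatrix wrapper, the offset array, the meaning of each triple (coefficients for three neighbouring columns) and the helper makeDiagonal are all assumptions made for illustration, and the "double stuff" member mentioned in the report is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the report's triple-of-doubles type.
struct MinimalVec3 {
    double c[3];
};

// Assumed band layout: entry i of band b couples row i to the columns
// i + offset[b] - 1, i + offset[b] and i + offset[b] + 1.
struct BandMatrix {
    std::vector<MinimalVec3> rows[9];  // declaration quoted in the report
    std::ptrdiff_t offset[9];          // assumed, not in the report
};

// Computes y = A * x. Columns that fall outside [0, n) are skipped.
void apply(const BandMatrix& A, const std::vector<double>& x,
           std::vector<double>& y) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(x.size());
    y.assign(x.size(), 0.0);
    for (int b = 0; b < 9; ++b) {
        // Each band is as long as the vector the matrix is applied to.
        assert(A.rows[b].size() == x.size());
        for (std::ptrdiff_t i = 0; i < n; ++i) {
            const MinimalVec3& t = A.rows[b][i];
            for (int k = 0; k < 3; ++k) {
                const std::ptrdiff_t j = i + A.offset[b] + k - 1;
                if (j >= 0 && j < n) {
                    y[i] += t.c[k] * x[j];
                }
            }
        }
    }
}

// Hypothetical helper: n-by-n matrix with d on the main diagonal and
// zeros elsewhere. Band 4 has offset 0, so its middle coefficient is
// the diagonal entry.
BandMatrix makeDiagonal(std::size_t n, double d) {
    BandMatrix A;
    const MinimalVec3 zero = {{0.0, 0.0, 0.0}};
    for (int b = 0; b < 9; ++b) {
        A.rows[b].assign(n, zero);
        A.offset[b] = b - 4;
    }
    for (std::size_t i = 0; i < n; ++i) {
        A.rows[4][i].c[1] = d;
    }
    return A;
}
```

A benchmark in the spirit of the report would call apply 1000 times on a large vector and compare the run time across compiler versions. Replacing std::vector<MinimalVec3> with a plain C array of MinimalVec3 would give the variant that, according to the report, does not show the slowdown.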