$ uname -a Linux oblomow 2.6.31.12-0.2-desktop #1 SMP PREEMPT 2010-03-16 21:25:39 +0100 x86_64 x86_64 x86_64 GNU/Linux $ g++-4.5 -v Using built-in specs. COLLECT_GCC=g++-4.5 COLLECT_LTO_WRAPPER=/opt/gcc-4.5.0/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../gcc-4.5.0/configure --with-mpc=/opt/gcc-4.5.0 --enable-languages=c,c++ --prefix=/opt/gcc-4.5.0 --program-suffix=-4.5 Thread model: posix gcc version 4.5.0 (GCC)
The following program computes fib(n) with NN identical, mutually recursive functions. #include <stdio.h> template<int k,int N> struct F { static int f(int n) { if (n<2) return n; return F<(k+1)%N,N>::f(n-2)+F<(k+1)%N,N>::f(n-1); } }; int main() { printf("NN=%d, res=%d\n",NN,F<0,NN>::f(44)); } For NN=1 only one function is used, for NN=2 F<0,2>:f(int) calls F<1,2>:f(int) and vice versa, etc.pp. The running time variance under different values for NN is **suprising**: > for x in {1..15} ; do g++-4.5 -O3 -DNN=$x f.cc -o f$x ; time ./f$x ; echo > "---"; done NN=1, res=701408733 real 0m4.617s user 0m4.606s sys 0m0.003s --- NN=2, res=701408733 real 0m2.361s user 0m2.358s sys 0m0.002s --- NN=3, res=701408733 real 0m3.267s user 0m3.264s sys 0m0.001s --- NN=4, res=701408733 real 0m1.191s user 0m1.183s sys 0m0.001s --- NN=5, res=701408733 real 0m0.949s user 0m0.946s sys 0m0.002s --- NN=6, res=701408733 real 0m1.541s user 0m1.540s sys 0m0.000s --- NN=7, res=701408733 real 0m1.475s user 0m1.459s sys 0m0.002s --- NN=8, res=701408733 real 0m0.509s user 0m0.499s sys 0m0.001s --- NN=9, res=701408733 real 0m0.723s user 0m0.720s sys 0m0.002s --- NN=10, res=701408733 real 0m1.566s user 0m1.558s sys 0m0.001s --- NN=11, res=701408733 real 0m1.871s user 0m1.868s sys 0m0.002s --- NN=12, res=701408733 real 0m0.806s user 0m0.803s sys 0m0.001s --- NN=13, res=701408733 real 0m1.735s user 0m1.735s sys 0m0.000s --- NN=14, res=701408733 real 0m1.282s user 0m1.281s sys 0m0.000s --- NN=15, res=701408733 real 0m1.916s user 0m1.906s sys 0m0.002s --- The NN=1 case is 9x slower than NN=8, which is again 3x faster than NN=7 and NN=10, even though it is the same computation in all cases. Great news is, that the figures themselves are much better than with gcc-4.4, but the variance bothers me a bit. -- Summary: Recusive functions optimize in unpredictable ways Product: gcc Version: 4.5.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: marco at technoboredom dot net GCC host triplet: x86_64-suse-linux GCC target triplet: x86_64-unknown-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44822