$ uname -a
Linux oblomow 2.6.31.12-0.2-desktop #1 SMP PREEMPT 2010-03-16 21:25:39 +0100
x86_64 x86_64 x86_64 GNU/Linux
$ g++-4.5 -v
Using built-in specs.
COLLECT_GCC=g++-4.5
COLLECT_LTO_WRAPPER=/opt/gcc-4.5.0/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.5.0/configure --with-mpc=/opt/gcc-4.5.0
--enable-languages=c,c++ --prefix=/opt/gcc-4.5.0 --program-suffix=-4.5
Thread model: posix
gcc version 4.5.0 (GCC)

The following program computes fib(n) with NN identical, mutually recursive
functions.

#include <stdio.h>

template<int k,int N>
struct F
{
        static int f(int n) 
        {
                if (n<2) return n; 
                return F<(k+1)%N,N>::f(n-2)+F<(k+1)%N,N>::f(n-1);
        }
};

int main()
{
        printf("NN=%d, res=%d\n",NN,F<0,NN>::f(44));
}

For NN=1 only one function is used, for NN=2 F<0,2>:f(int) calls F<1,2>:f(int)
and vice versa, etc.pp. The running time variance under different values for NN
is **suprising**:

> for x in {1..15} ; do g++-4.5 -O3 -DNN=$x f.cc -o f$x ; time ./f$x ; echo 
> "---"; done
NN=1, res=701408733                                                             

real    0m4.617s
user    0m4.606s
sys     0m0.003s
---             
NN=2, res=701408733

real    0m2.361s
user    0m2.358s
sys     0m0.002s
---             
NN=3, res=701408733

real    0m3.267s
user    0m3.264s
sys     0m0.001s
---             
NN=4, res=701408733

real    0m1.191s
user    0m1.183s
sys     0m0.001s
---             
NN=5, res=701408733

real    0m0.949s
user    0m0.946s
sys     0m0.002s
---             
NN=6, res=701408733

real    0m1.541s
user    0m1.540s
sys     0m0.000s
---             
NN=7, res=701408733

real    0m1.475s
user    0m1.459s
sys     0m0.002s
---
NN=8, res=701408733

real    0m0.509s
user    0m0.499s
sys     0m0.001s
---
NN=9, res=701408733

real    0m0.723s
user    0m0.720s
sys     0m0.002s
---
NN=10, res=701408733

real    0m1.566s
user    0m1.558s
sys     0m0.001s
---
NN=11, res=701408733

real    0m1.871s
user    0m1.868s
sys     0m0.002s
---
NN=12, res=701408733

real    0m0.806s
user    0m0.803s
sys     0m0.001s
---
NN=13, res=701408733

real    0m1.735s
user    0m1.735s
sys     0m0.000s
---
NN=14, res=701408733

real    0m1.282s
user    0m1.281s
sys     0m0.000s
---
NN=15, res=701408733

real    0m1.916s
user    0m1.906s
sys     0m0.002s
---

The NN=1 case is 9x slower than NN=8, which is again 3x faster than NN=7 and
NN=10, even though it is the same computation in all cases. Great news is, that
the figures themselves are much better than with gcc-4.4, but the variance
bothers me a bit.


-- 
           Summary: Recusive functions optimize in unpredictable ways
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: marco at technoboredom dot net
  GCC host triplet: x86_64-suse-linux
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44822

Reply via email to