------- Comment #5 from jacob at math dot jussieu dot fr  2006-12-13 20:22 -------
Nope... with -O3 -ffast-math I get 1.9 seconds on average (this is a laptop
with CPU frequency scaling, so it's difficult to get precise numbers). Adding
-funroll-loops on top of -ffast-math doesn't seem to make a difference.
That is still very far from the 0.3 seconds I get with -DUNROLL.

Also, trying -O3 -funroll-loops again, I get 1.9 seconds once more, so I think
-funroll-loops never made any difference and I had been fooled by CPU
frequency scaling.

The problem with the multiplication is not important to me, it's just something
I used in this example. I could as well have written

    for( int i = 0; i < 3; i++ )
        for( int j = 0; j < 3; j++ )
            (*this)(i, j) = (i == j) ? factor : 0;

But this turns out to be even slower. I presume that's because the two loops
don't both get unrolled, so the i == j test in the ?: expression turns into
branches at run-time.
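
For illustration, here is a minimal sketch of the two variants being compared.
The Matrix3 type, its column-major storage and both method names are
assumptions made up for this example; this is not the actual class from the
test case, nor the code behind -DUNROLL. It only shows what "fully unrolled"
means here: once i and j are constants, every i == j test is resolved at
compile time, leaving nine plain stores.

```cpp
// Hypothetical 3x3 matrix, standing in for the class in the test case.
// Column-major storage is an assumption.
struct Matrix3 {
    double data[9];

    double& operator()(int i, int j) { return data[i + 3 * j]; }
    double operator()(int i, int j) const { return data[i + 3 * j]; }

    // Nested-loop version, as written in the comment above.
    // Unless both loops are unrolled, i == j is evaluated at run-time.
    void loadScalingLoop(double factor) {
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                (*this)(i, j) = (i == j) ? factor : 0;
    }

    // Hand-unrolled version: no loop counters and no comparisons left,
    // just nine stores with the diagonal already decided.
    void loadScalingUnrolled(double factor) {
        data[0] = factor; data[1] = 0;      data[2] = 0;
        data[3] = 0;      data[4] = factor; data[5] = 0;
        data[6] = 0;      data[7] = 0;      data[8] = factor;
    }
};
```

Both methods should leave the matrix in the same state, so timing one against
the other in a loop is one way to see how much the missing unrolling costs
with a given set of flags.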

Anyway, thanks for being supportive and for looking into my problem. May I ask
again: can I hope for a g++ that fully unrolls nested loops in the near future?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30201
