------- Comment #6 from laurent at ient dot rwth-aachen dot de 2007-11-16 20:42 -------
> Note that for completely inlining kernels you can use the
> __attribute__((flatten))
> on the *calling* function. Usually with expression templates that is the
> function containing the loops, like
>   void __attribute__((flatten)) doit()
>   {
>     for (;;)
>       lots_of_calls_to_inline ();
>   }
> and it will make sure to inline all calls done in doit (recursively, so no
> calls will be left in the final version). Also starting with GCC 4.2 (and much
> improved on trunk which will become 4.3) using profile-feedback will
> improve inline performance a lot (use -fprofile-generate, run, -fprofile-use).

Good to know! Thanks for the advice!
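
For the record, here is a minimal sketch of how I understand the flatten suggestion quoted above. The names scale, add and axpy_sum are hypothetical stand-ins for expression-template operators and the loop function; they are not taken from any real code. The attribute is GCC-specific, and how much actually gets inlined depends on the compiler version and on the callee bodies being visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical small kernels standing in for expression-template operators.
inline double scale(double x, double factor) { return x * factor; }
inline double add(double x, double y) { return x + y; }

// flatten goes on the *calling* function that contains the loop.
// GCC then tries to inline every call made in its body, recursively.
double __attribute__((flatten))
axpy_sum(const std::vector<double>& x, const std::vector<double>& y, double a)
{
    assert(x.size() == y.size());  // both inputs must have the same length
    double total = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        total = add(total, add(scale(x[i], a), y[i]));
    return total;
}
```

Calling axpy_sum with x = {1, 2}, y = {3, 4} and a = 2 accumulates (1*2 + 3) + (2*2 + 4). Whether the calls really disappeared can be checked in the assembly output of g++ -O2 -S.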

> But we are better in freedom. ;-)

Much better!

> OK. So this is fixed. Thanks for the report nonetheless. And sorry for the
> delay.

No problem. Thanks to all of you.

> I don't think all the inlining improvements (many) can be traced back to any
> specific individual complaining (not even Linus Torvalds ;)

(Oops! Sorry for having misspelled the name of Linus Torvalds!)
You are most probably right. I was nevertheless happy to notice that I was not alone in complaining about the problem.

> Details would be certainly welcome. Ideally, a reduced snippet, to pursue the
> optimization people to take action reasonably quickly...

Hmm, difficult. I just sometimes compare the execution speed of numerical calculations built with different compilers (ICL, VC2005, GCC), and ICL is often faster, by maybe 10%. If I find more specific and simpler examples, I'll post them.

I especially appreciate the way GCC reports compilation errors from deeply nested templates. I could not have written deeply nested template expressions with the complicated error messages from ICL or VC2005.

I have to say that ICL has obviously no longer respected the __forceinline directive since versions 9 and 10; for me this is a clear regression. I do not know exactly what changed in these latest versions, but I do not want to give up my good old version 8.1 for them.

Thanks

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21628