------- Comment #3 from rguenth at gcc dot gnu dot org 2007-11-16 18:00 -------

Note that for completely inlining kernels you can use the __attribute__((flatten)) on the *calling* function. Usually with expression templates that is the function containing the loops, like

    void __attribute__((flatten)) doit()
    {
      for (;;)
        lots_of_calls_to_inline ();
    }

and it will make sure to inline all calls made in doit (recursively, so no calls will be left in the final version).

Also, starting with GCC 4.2 (and much improved on trunk, which will become 4.3), using profile feedback will improve inlining performance a lot (use -fprofile-generate, run, -fprofile-use).

I'll close this bug as worksforme as it doesn't have a useful testcase, and from my experience with tramp3d-v4 the performance of ICC sucks compared to GCC because ICC inlines too little ;)

--

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WORKSFORME

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21628
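
For illustration only, here is a slightly fuller sketch of the flatten suggestion from the comment above. The helper names (scale, add, axpy_elem, axpy) are hypothetical stand-ins for the small operator functions an expression-template library would generate; they are not taken from the bug report or from tramp3d-v4. The attribute goes on the function that contains the loop, not on the helpers.

```cpp
#include <cstddef>

// Hypothetical small helpers, standing in for expression-template operators.
inline double scale(double x, double a) { return a * x; }
inline double add(double x, double y) { return x + y; }

// One level of nesting, so that inlining has to proceed recursively.
inline double axpy_elem(double a, double x, double y)
{
  return add(scale(x, a), y);
}

// flatten asks GCC to inline, where possible, every call made in the body
// of this function, including calls made by the inlined callees.
__attribute__((flatten))
void axpy(double a, const double *x, double *y, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    y[i] = axpy_elem(a, x[i], y[i]);
}
```

The profile-feedback route mentioned in the comment would be, roughly: compile with -fprofile-generate, run the program on a representative input, then recompile with -fprofile-use. Whether either approach helps a given kernel is something to measure rather than assume.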