[Bug middle-end/21628] GCC much slower than ICL. Lack of inlining?

rguenth at gcc dot gnu dot org Fri, 16 Nov 2007 10:07:25 -0800


------- Comment #3 from rguenth at gcc dot gnu dot org  2007-11-16 18:00 -------
Note that to inline a kernel completely you can put
__attribute__((flatten))
on the *calling* function.  With expression templates that is usually the
function containing the loops, like


void __attribute__((flatten)) doit()
{
  for (;;)
    lots_of_calls_to_inline ();
}

and it will make sure to inline all calls made in doit (recursively, so no
calls should be left in the final version).  Also, starting with GCC 4.2 (and
much improved on trunk, which will become 4.3), using profile feedback will
improve inlining a lot (build with -fprofile-generate, run, then rebuild
with -fprofile-use).
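
As a rough illustration of the flatten usage described above, here is a
minimal sketch in plain C.  The helper names (scale, axpy) and the kernel
saxpy_kernel are hypothetical and not taken from the bug report; they only
stand in for the many small calls an expression-template kernel would make.

```c
#include <stddef.h>

/* Hypothetical small helpers, standing in for the layers of tiny
   functions that expression templates typically generate. */
static double scale(double x, double a)
{
  return a * x;
}

static double axpy(double a, double x, double y)
{
  return scale(x, a) + y;   /* nested call: axpy -> scale */
}

/* flatten goes on the *calling* function, the one containing the loop.
   It asks the compiler to inline the calls in its body, including the
   nested call to scale(), where that is possible. */
__attribute__((flatten))
void saxpy_kernel(double *out, const double *x, const double *y,
                  double a, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = axpy(a, x[i], y[i]);
}
```

The attribute does not change what the function computes, only how it is
compiled, so the effect is best checked by looking at the generated
assembly for remaining calls inside saxpy_kernel.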

I'll close this bug as WORKSFORME, as it doesn't have a useful testcase and,
in my experience with tramp3d-v4, the performance of ICC sucks compared to
GCC because ICC inlines too little ;)


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WORKSFORME


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21628
