------- Additional Comments From tbptbp at gmail dot com  2005-05-05 23:58 
-------
For future reference, I'm including the answer I sent Uros offline, from an
end-user's point of view, regarding always_inline usage.

Here we go:
> I was trying to take a quick look at your bugreport regarding
> always_inline attribute. Just a quick remark - using only a plain static
> inline bool .... fixes the problem for me and at -O3 code looks like it
Doesn't surprise me.

> should. Is there a specific reason to have an attribute always_inline
> declared for the function you would like to inline? (Please note, that
> Jan Hubicka is currently working in this area.)
Yes, because inline alone is next to useless in practice. You say
below that reg<->mem movements are expensive, but my prime concern in
a hot path is branches.
And if you expect code to be inlined (or more precisely, you expect no
function call), then you have no alternative but to use always_inline.
Though once you start using always_inline you upset the compiler and
step into a world of pain, where you have to babysit it for dependent
code with a combo of always_inline/noinline.

In fact, the always_inline/noinline combo is the only kludge for a number
of other problems:
. when gcc goes nuts, they are useful containment measures (so the
silliness doesn't propagate)
. as said earlier, inline being an (ignored) hint, if you have, say, a
member function doing just one op (like those intrinsics in the
testcase), it makes absolutely no sense not to inline it. Ever. Yet
sometimes it happens.
. gcc doesn't like long sequences of branchless vectorized code, which
are quite common, and a static always_inline function is a way to tell
it to look somewhere else.
. those same static always_inline functions are also a way to tell it
to look closer at some portion of code and to try to map its working set
into registers; this also has to do with the lack of an unroll pragma
and, more generally, of any directive telling the compiler to pay
special attention to specific code.
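
To illustrate the one-op member function case from the list above, here is a minimal sketch. It assumes GCC's `__attribute__((always_inline))` spelling; the `FORCE_INLINE` macro and the `vec4` type are made up for this example and merely stand in for the intrinsic wrappers of the testcase, which are not reproduced here.

```cpp
// GCC attribute spelling. FORCE_INLINE is a hypothetical macro name
// used only in this sketch.
#define FORCE_INLINE inline __attribute__((always_inline))

// Hypothetical 4-wide vector wrapper (plain floats instead of real
// SSE intrinsics, so the sketch builds on any target).
struct vec4 {
    float v[4];

    // Each member does a single operation: there is no good reason
    // for it ever to end up as an out-of-line call.
    FORCE_INLINE vec4 operator+(const vec4 &o) const {
        vec4 r;
        for (int i = 0; i < 4; ++i) r.v[i] = v[i] + o.v[i];
        return r;
    }

    FORCE_INLINE vec4 operator*(const vec4 &o) const {
        vec4 r;
        for (int i = 0; i < 4; ++i) r.v[i] = v[i] * o.v[i];
        return r;
    }
};
```

With a plain `inline` the compiler remains free to emit a call for these operators; the attribute asks it to inline them regardless of its own heuristics.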

So in the hot path my code typically ends up being a bunch of
always_inline functions coalesced into a noinline one.
For the non-speed-critical path, I leave it up to the compiler. In that
regard, gcc4.x (and specifically gcc4.1) got a lot wiser, perhaps as
good as icc, but obviously not foolproof :)
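
The hot-path layout described above (always_inline helpers gathered inside one noinline function) could look roughly like the sketch below. All names here (`FORCE_INLINE`, `NO_INLINE`, `clamp01`, `lerp`, `blend_rows`) are hypothetical and not taken from the testcase; only the two GCC attributes are real.

```cpp
#include <cstddef>

// Hypothetical convenience macros around the GCC attributes.
#define FORCE_INLINE inline __attribute__((always_inline))
#define NO_INLINE __attribute__((noinline))

// Small helpers that must never turn into calls inside the loop.
static FORCE_INLINE float clamp01(float x) {
    x = x < 0.0f ? 0.0f : x;
    return x > 1.0f ? 1.0f : x;
}

static FORCE_INLINE float lerp(float a, float b, float t) {
    return a + (b - a) * t;
}

// The noinline function is the containment boundary: the helpers are
// folded into it, and it is itself kept out of its callers.
static NO_INLINE void blend_rows(const float *a, const float *b,
                                 const float *t, float *out,
                                 std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = lerp(a[i], b[i], clamp01(t[i]));
}
```

The idea is that the compiler sees the whole kernel as one unit to optimize, while whatever it decides there does not leak into the surrounding, non-critical code.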



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21195
