[Bug other/49194] Trivially stupid inlining decisions

torva...@linux-foundation.org Fri, 27 May 2011 09:39:32 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49194


--- Comment #6 from Linus Torvalds <torva...@linux-foundation.org> 2011-05-27 
16:38:22 UTC ---
(In reply to comment #3)
> 
> -finline-functions-called-once is throttled down by the large-function-growth
> and large-stack-frame-growth limits. The kernel case could probably be handled
> by the second. Does the kernel bump down those limits?

We used to play with inlining limits (gcc had some really bad decisions), but
the meaning of the numbers kept changing from one gcc version to another, and
the heuristics gcc used kept changing too. Which made it practically impossible
to use sanely - you could tweak it for one particular architecture, and one
particular version of gcc, but it would then be worse for others.

Quite frankly, with that kind of history, I'm not very eager to start playing
around with random gcc internal variables again.

So I'd much rather have gcc have good heuristics by default, possibly helped by
the kinds of obvious hints we can give ("unlikely()" in particular is something
we can add for things like this).

Obviously, we can (and do) force the decision with either "noinline"
or "__always_inline" (which are just the kernel macros to make the gcc
attribute syntax slightly more readable), but since I've been doing those other
bug reports about bad gcc code generation, I thought I'd point out this one
too.
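
For readers who have not seen those macros, here is a minimal sketch of how
such wrappers can be spelled on top of the gcc attribute syntax. The two
attributes are real gcc function attributes; the helper functions
(report_bad_value, check_value) are hypothetical names made up for this
illustration.

```c
/* Readability wrappers around the gcc attribute syntax, in the style the
 * kernel uses. Guarded because some libc headers already define
 * __always_inline themselves. */
#ifndef noinline
#define noinline __attribute__((noinline))
#endif
#ifndef __always_inline
#define __always_inline inline __attribute__((always_inline))
#endif

/* Hypothetical rarely-taken error path: force it to stay out of line so it
 * does not bloat the callers. */
static noinline int report_bad_value(int value)
{
    (void)value;        /* a real handler would log or clean up here */
    return -1;
}

/* Hypothetical tiny fast path: force it inline at every call site. */
static __always_inline int check_value(int value)
{
    if (value < 0)
        return report_bad_value(value);
    return value;
}
```

Calling check_value(5) returns 5 and check_value(-3) returns -1; the only
thing the attributes change is where the code ends up, not what it computes.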

> It still won't help in case the function doesn't have any on-stack aggregates,
> since we optimistically assume that all gimple registers will disappear.
> Probably even that could be changed, though estimating reload's stack frame
> usage so early would be iffy.

Yes, early stack estimation might not work all that well.

In the kernel, we do end up having a few complex functions that we basically
expect to inline to almost nothing - simply because we end up depending on
compile-time constant issues (sometimes very explicitly, with
__builtin_constant_p() followed by a largish "switch ()" statement).
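
As a rough illustration of that pattern (not actual kernel code), the
function below looks large in source form but collapses to a single constant
when it is inlined with a compile-time constant argument. The name
size_class and the set of sizes handled are assumptions made up for this
sketch; __builtin_constant_p() is the real gcc builtin.

```c
/* Hypothetical helper: map a size to the exponent of the smallest power of
 * two that can hold it. */
static inline __attribute__((always_inline))
int size_class(unsigned long long size)
{
    /* When the argument is a compile-time constant, the whole switch folds
     * away and the call becomes a plain constant. */
    if (__builtin_constant_p(size)) {
        switch (size) {
        case 8:  return 3;
        case 16: return 4;
        case 32: return 5;
        case 64: return 6;
        default: break;     /* other constants use the generic path */
        }
    }

    /* Generic path for run-time values: find the smallest n such that
     * 2^n >= size. */
    int n = 0;
    while (n < 63 && (1ULL << n) < size)
        n++;
    return n;
}
```

Both paths give the same answer for the same input, so whether the constant
shortcut is taken only affects the generated code, not the result.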

That said, this is something where the call-site really can make a big
difference. Not just the fact that the call site might be marked "unlikely()"
(again, that's just the kernel making __builtin_expect() readable), but things
like "none of the arguments are constants" could easily be a good heuristic to
use as a basis for whether to inline or not.
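
A minimal sketch of what such a call-site hint looks like, for reference.
The macro definitions follow the usual __builtin_expect() idiom; the
functions handle_negative and process_input are hypothetical and exist only
for this example.

```c
/* Readable wrappers around __builtin_expect(); the !! normalises any
 * non-zero value to 1 before it is compared with the expected value. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical cold error handler, reached only from the unlikely branch. */
static int handle_negative(int input)
{
    (void)input;
    return -1;
}

/* Hypothetical caller: the call to handle_negative() sits in a region the
 * programmer has told the compiler is rarely executed, and its argument is
 * not a constant. */
static int process_input(int input)
{
    if (unlikely(input < 0))
        return handle_negative(input);
    return input * 2;
}
```

The hint does not change behaviour: process_input(4) returns 8 and
process_input(-1) returns -1 either way. It only tells the compiler which
branch to treat as the cold one.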

IOW, start out with whatever 'large-stack-frame-growth' and
'large-function-growth' values, but if the call-site is in an unlikely region,
cut those values in half (or whatever). And if none of the arguments are
constants, cut it in half again.
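
To make the suggestion concrete, here is a sketch of that scaling rule. It
is not gcc's implementation and does not use gcc's internal data structures:
the struct, its fields and the function name are all hypothetical, and the
base value of 100 in the usage note is an arbitrary number, not a real
default.

```c
#include <stdbool.h>

/* Hypothetical summary of what the inliner knows about one call site. */
struct call_site_info {
    bool in_unlikely_region;  /* e.g. guarded by unlikely() */
    int  constant_args;       /* number of compile-time constant arguments */
};

/* Start from the configured growth limit and halve it for each reason the
 * call site looks like a poor inlining candidate. */
static int adjusted_growth_limit(int base_limit,
                                 const struct call_site_info *site)
{
    int limit = base_limit;

    if (site->in_unlikely_region)
        limit /= 2;
    if (site->constant_args == 0)
        limit /= 2;

    return limit;
}
```

With a base limit of 100, a cold call site with no constant arguments gets
25, a cold call site with at least one constant argument gets 50, and a hot
call site with constant arguments keeps the full 100.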

This is an example of why giving these limits as compiler options really
doesn't work: the choice should probably be much more dynamic than just a
single number.

I dunno. As mentioned, we can fix this problem by just marking things noinline
by hand. But I do think that there are fairly obvious cases where inlining
really isn't worth it, and gcc might as well just get those cases right.
