https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Erich Keane from comment #3)
> As you know, "extern template" is a hint to the compiler that we don't need
> to emit the template as a way to save on compile time.
> 
> Both GCC and clang will NOT instantiate these templates in O0 mode. 
> However, in O1+ modes, both will actually still instantiate the templates in
> the frontend, BUT only for 'inline' functions.  Basically, we're using
> 'inline' as a heuristic that there is benefit in sending these functions to
> the optimizer (basically, sacrificing the compile time gained by 'extern
> template' in exchange for a better inlining experience).

Hmm, I've seen different behaviours for clang and g++ in this respect, with
clang inlining a lot more of std::string's members. So I'm surprised they use
the same heuristic.

Do they both instantiate the function templates marked 'inline' even at -O1?
Presumably not at -O0.

> In the submitter's case, the std::string constructor calls "_M_construct". 
> The constructor is inlined, but _M_construct is not, since it never gets to
> the optimizer.
> 
> libc++ uses an __init function to do the same thing as _M_construct, however
> IT is marked inline, and thus doesn't have the problem.
> 
> I believe the submitter wants to have you mark more of the functions in
> extern-templated classes 'inline' so that it matches the heuristic better.
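
To make the situation in the quoted text concrete, here is a minimal sketch.
The names basic_buf and construct are hypothetical stand-ins, not the real
libstdc++ or libc++ code. They only mirror the shape being discussed: a
constructor defined in the class body (so implicitly 'inline') that calls a
helper defined out of class without 'inline'.

```cpp
#include <cstddef>

// Hypothetical stand-in for a string-like class template (not libstdc++ code).
template <typename CharT>
struct basic_buf {
    CharT data[32];
    std::size_t len;

    // Defined in the class body, so it is implicitly 'inline'.
    basic_buf(const CharT* s, std::size_t n) { construct(s, n); }

    // Only declared here; the definition below is NOT marked 'inline'.
    void construct(const CharT* s, std::size_t n);

    std::size_t size() const { return len; }
};

// Out-of-class definition without 'inline', analogous to _M_construct.
template <typename CharT>
void basic_buf<CharT>::construct(const CharT* s, std::size_t n) {
    len = n < 31 ? n : 31;              // truncate to fit the fixed buffer
    for (std::size_t i = 0; i < len; ++i)
        data[i] = s[i];
    data[len] = CharT();                // null-terminate
}

// Explicit instantiation declaration: a definition exists elsewhere.
extern template struct basic_buf<char>;

// Explicit instantiation definition. In a real library this would live in a
// separate source file; it is here only so the sketch links on its own.
template struct basic_buf<char>;
```

If the heuristic is as described in comment #3, then in a translation unit
that sees only the 'extern template' declaration, the constructor would be a
candidate for instantiation and inlining at -O1 and above, while the call to
construct() would stay out of line.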

And that's what I don't want to do. I think it's wrong for the human to say
"inline this!" because humans are stupid (well, I am anyway). And I don't want
to have to examine the GIMPLE/asm again for every new GCC release to decide
whether 'inline' is still in the right places (and whether the answer should be
different for every different version of Clang or ICC!)

And when I say "I don't want to" I mean "I am never ever going to".

> I don't think that there is a good way to change the compiler itself without
> making 'extern template' absolutely meaningless.

I absolutely disagree.

It would still give a reduction in object file size for cases where the
compiler decides not to inline, and still make compilation much faster for -O0
and -O1.

One property of -O2 and -O3 is that we try to optimize aggressively even if
that takes a long time to compile. So we could instantiate things that have an
explicit instantiation declaration (thus doing "redundant" work) to see if
inlining them would be beneficial. That would take longer to compile, but might
produce faster code. If the heuristics decide the instantiation ends up too big
to inline, it could just discard it (because we know there's a definition
elsewhere).

If the only way to get that is to mark every function as 'inline' (and then
"trick" the compiler into doing all that extra work even at -O1?) then we might
as well add 'inline' to every single function template in <string> and
<istream>, <ostream>, <streambuf> etc. so they're all potential candidates for
inlining.
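
For reference, this is roughly what that workaround looks like on a single
member. It is a sketch only: counter and bump are hypothetical names, not
anything in libstdc++, and the effect on inlining assumes the heuristic
described in comment #3.

```cpp
#include <cstddef>

// Hypothetical class template standing in for a library type.
template <typename T>
struct counter {
    std::size_t n = 0;

    void bump();                            // declared here, defined below
    std::size_t get() const { return n; }   // in-class, implicitly 'inline'
};

// The workaround: spell 'inline' on the out-of-class definition so the
// member is an inline function despite the 'extern template' below.
template <typename T>
inline void counter<T>::bump() { ++n; }

// Explicit instantiation declaration, as a library header would have.
extern template struct counter<int>;

// Explicit instantiation definition. It would normally be in the library's
// own source file; it is here only so the sketch is self-contained.
template struct counter<int>;
```

Repeating that keyword on every member function template of every
explicitly instantiated class is the maintenance burden being objected to.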

And if we have to mark every single function as 'inline' then maybe the
compiler shouldn't be using it as a hint.
