https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960
--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Erich Keane from comment #3)
> As you know, "extern template" is a hint to the compiler that we don't need
> to emit the template as a way to save on compile time.
>
> Both GCC and clang will NOT instantiate these templates in O0 mode.
> However, in O1+ modes, both will actually still instantiate the templates in
> the frontend, BUT only for 'inline' functions. Basically, we're using
> 'inline' as a heuristic that there is benefit in sending these functions to
> the optimizer (basically, sacrificing the compile time gained by 'extern
> template' in exchange for a better inlining experience).

Hmm, I've seen different behaviours for clang and g++ in this respect, with clang inlining a lot more of std::string's members. So I'm surprised they use the same heuristic.

Do they both instantiate the function templates marked 'inline' even at -O1? Presumably not at -O0.

> In the submitter's case, the std::string constructor calls "_M_construct".
> The constructor is inlined, but _M_construct is not, since it never gets to
> the optimizer.
>
> libc++ uses an __init function to do the same thing as _M_construct, however
> IT is marked inline, and thus doesn't have the problem.
>
> I believe the submitter wants to have you mark more of the functions in
> extern-templated classes 'inline' so that it matches the heuristic better.

And that's what I don't want to do. I think it's wrong for the human to say "inline this!" because humans are stupid (well, I am anyway). And I don't want to have to examine the GIMPLE/asm again for every new GCC release to decide whether 'inline' is still in the right places (and whether the answer should be different for every different version of Clang or ICC!)

And when I say "I don't want to" I mean "I am never ever going to".

> I don't think that there is a good way to change the compiler itself without
> making 'extern template' absolutely meaningless.
I absolutely disagree. It would still give a reduction in object file size for cases where the compiler decides not to inline, and still make compilation much faster for -O0 and -O1.

One property of -O2 and -O3 is that we try to optimize aggressively even if that takes a long time to compile. So we could instantiate things that have an explicit instantiation declaration (thus doing "redundant" work) to see if inlining them would be beneficial. That would take longer to compile, but might produce faster code. If the heuristics decide the instantiation ends up too big to inline, it could just discard it (because we know there's a definition elsewhere).

If the only way to get that is to mark every function as 'inline' (and then "trick" the compiler into doing all that extra work even at -O1?) then we might as well add 'inline' to every single function template in <string> and <istream>, <ostream>, <streambuf> etc. so they're all potential candidates for inlining. And if we have to mark every single function as 'inline' then maybe the compiler shouldn't be using it as a hint.
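
For anyone following along, here is a rough single-file sketch of the shape of the problem. It is not libstdc++ code: Buf, construct, use_buf and kCapacity are made-up names standing in for std::string, _M_construct and a caller. The constructor is defined in-class, so it is implicitly 'inline'. construct() is defined out of class without 'inline', so under the heuristic described in comment #3 it would presumably not be instantiated in a TU that only sees the explicit instantiation declaration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a library class template (not libstdc++ code).
constexpr std::size_t kCapacity = 16;

template <typename CharT>
struct Buf {
    CharT data[kCapacity];
    std::size_t len = 0;

    // Defined in-class, therefore implicitly 'inline'.
    Buf(const CharT* s, std::size_t n) { construct(s, n); }

    // Declared here, defined out of class below: NOT inline.
    void construct(const CharT* s, std::size_t n);

    std::size_t size() const { return len; }
};

template <typename CharT>
void Buf<CharT>::construct(const CharT* s, std::size_t n) {
    assert(n <= kCapacity);
    len = n;
    for (std::size_t i = 0; i < len; ++i)
        data[i] = s[i];
}

// Explicit instantiation declaration: "a definition exists elsewhere".
extern template struct Buf<char>;

// A caller that only has the declaration above to go on.
std::size_t use_buf() {
    Buf<char> b("hello", 5);
    return b.size();
}

// Explicit instantiation definition. A real library would put this in a
// separate TU; it is kept here only so the sketch links as one file.
template struct Buf<char>;
```

To see what a given compiler does with this, the instantiation definition would need to move to a second file, and the assembly for use_buf() could then be compared at -O0 and -O2. Whether the call to construct() survives will depend on the compiler and version, which is rather the point of the complaint above.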