https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119387

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114563

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Patrick Palka from comment #6)
> Strangely, it seems to have started with r14-5979 "c++: P2280R4, Using
> unknown refs in constant expr [PR106650]".
>
> GCC trunk -ftime-report (with -O2 -g):
>
>  callgraph construction             : 257.75 ( 66%)   444M (  5%)
>  template instantiation             :  18.99 (  5%)  2082M ( 22%)
>  constant expression evaluation     :  35.33 (  9%)  5799M ( 61%)
>  TOTAL                              : 391.71         9484M
>
> GCC trunk -ftime-report (with -O2 -g), r14-5979 reverted:
>
>  callgraph construction             :  11.57 ( 23%)   444M ( 12%)
>  template instantiation             :  18.66 ( 38%)  2082M ( 54%)
>  constant expression evaluation     :   1.59 (  3%)   147M (  4%)
>  TOTAL                              :  49.53         3839M
>
> With just -O2 -fsyntax-only, there's also >3x increase in peak memory usage,
> 2.2GB vs 6.8GB.

For me, a not up-to-date trunk with release checking with -O2:

 phase opt and generate             :  12.86 ( 20%)   164M (  2%)
 callgraph construction             :   8.28 ( 13%)    55M (  1%)
 template instantiation             :  13.50 ( 21%)  2078M ( 24%)
 constant expression evaluation     :  32.17 ( 51%)  5799M ( 68%)
 TOTAL                              :  63.13         8504M

and with -O2 -g:

 phase opt and generate             : 297.82 ( 69%)   593M (  6%)
 callgraph construction             : 292.23 ( 67%)   444M (  5%)
 template instantiation             :  14.54 (  3%)  2083M ( 22%)
 constant expression evaluation     :  32.20 (  7%)  5799M ( 61%)
 symout                             :  85.34 ( 20%)   535M (  6%)
 TOTAL                              : 434.68         9485M

to me the increased -g time is simply debug info generation (we have no
timevar for that, symout captures some of it), possibly because of very
many templates that are being instantiated?

Interestingly we have again (I've seen this elsewhere, PR114563):

Samples: 1M of event 'cycles:P', Event count (approx.): 1867236850006
Overhead       Samples  Command  Shared Object  Symbol
  85.81%       1500713  cc1plus  cc1plus        [.] ggc_internal_alloc(un
   1.55%         28292  cc1plus  cc1plus        [.] cxx_eval_constant_exp
   1.44%         25472  cc1plus  cc1plus        [.] find_substitution(tre

timevar coverage can be improved by putting early debug generation under
TV_SYMOUT as well:

diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc
index 82f205488e9..fa54a59d02b 100644
--- a/gcc/cgraphunit.cc
+++ b/gcc/cgraphunit.cc
@@ -2588,6 +2588,8 @@ symbol_table::finalize_compilation_unit (void)
 
   if (!seen_error ())
     {
+      timevar_push (TV_SYMOUT);
+
       /* Give the frontends the chance to emit early debug based on
          what is still reachable in the TU. */
       (*lang_hooks.finalize_early_debug) ();
@@ -2597,6 +2599,8 @@ symbol_table::finalize_compilation_unit (void)
       debuginfo_early_start ();
       (*debug_hooks->early_finish) (main_input_filename);
       debuginfo_early_stop ();
+
+      timevar_pop (TV_SYMOUT);
     }
 
   /* Finally drive the pass manager. */
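
To illustrate why wrapping early debug generation in a push/pop pair changes where the time shows up in the report, here is a toy model of a timer stack in which elapsed time is always charged to the timer on top. It is only a sketch and not GCC's timevar.cc: the `TimerStack` class, the `simulate` helper, the injected clock values, and the 10/80/10 timeline are all made up for illustration, and it assumes the enclosing timer is "callgraph construction", which the numbers above suggest but the patch itself does not show.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model of a timer stack (hypothetical, not GCC's implementation):
// elapsed time is always charged to whichever timer is on top.
class TimerStack {
public:
  // `now` is an injected clock value so the example is deterministic.
  void push(const std::string &name, long now) {
    charge_top(now);
    stack_.push_back(name);
  }

  void pop(const std::string &name, long now) {
    assert(!stack_.empty() && stack_.back() == name);
    charge_top(now);
    stack_.pop_back();
  }

  const std::map<std::string, long> &totals() const { return totals_; }

private:
  // Attribute the time since the last event to the current top timer.
  void charge_top(long now) {
    if (!stack_.empty())
      totals_[stack_.back()] += now - last_;
    last_ = now;
  }

  std::vector<std::string> stack_;
  std::map<std::string, long> totals_;
  long last_ = 0;
};

// Made-up timeline: 10 units of callgraph work, 80 units of early debug
// generation, then 10 more units of callgraph work.
std::map<std::string, long> simulate(bool wrap_early_debug) {
  TimerStack t;
  t.push("callgraph construction", 0);
  if (wrap_early_debug)
    t.push("symout", 10);
  // ... early debug generation would run from t=10 to t=90 here ...
  if (wrap_early_debug)
    t.pop("symout", 90);
  t.pop("callgraph construction", 100);
  return t.totals();
}
```

In this model `simulate(false)` charges all 100 units to "callgraph construction", while `simulate(true)` splits them into 20 for "callgraph construction" and 80 for "symout". The total is unchanged; only the attribution moves.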