On Mon, Oct 7, 2024 at 6:27 AM Richard Biener <richard.guent...@gmail.com> wrote: > > On Sat, Oct 5, 2024 at 10:17 AM Andi Kleen <a...@linux.intel.com> wrote: > > > > From: Andi Kleen <a...@gcc.gnu.org> > > > > Time vars normally use times(2) to get the user/sys/wall time, which is > > always a > > system call. I don't think the system time is very useful because most > > overhead > > is in user time. If we only use the wall (or monotonic) time modern OS have > > an > > optimized path to get it directly from a CPU instruction like RDTSC > > without system call, which is much faster. > > > > Add a -ftime-report-wall option. It actually uses the POSIX monotonic time, > > so strictly it's not wall clock, but it's still a reasonable name. > > > > Comparing the overhead with tramp3d -O0: > > > > ./gcc/cc1plus -quiet ../tsrc/tramp3d-v4.i ran > > 1.03 ± 0.00 times faster than ./gcc/cc1plus -quiet -ftime-report-wall > > ../tsrc/tramp3d-v4.i > > 1.18 ± 0.00 times faster than ./gcc/cc1plus -quiet -ftime-report > > ../tsrc/tramp3d-v4.i > > > > -ftime-report costs 18% (excluding the output), while -ftime-report-wall > > only costs 3%, so is nearly free. So it would be feasible for some build > > system to always enable it and break down the build time into passes. > > > > With -O2 it is a bit less pronounced but still visible: > > > > ./gcc/cc1plus -O2 -quiet ../tsrc/tramp3d-v4.i ran > > 1.00 ± 0.00 times faster than ./gcc/cc1plus -O2 -quiet > > -ftime-report-wall ../tsrc/tramp3d-v4.i > > 1.08 ± 0.01 times faster than ./gcc/cc1plus -O2 -quiet -ftime-report > > ../tsrc/tramp3d-v4.i > > > > The drawback is that if there is context switching with other programs > > the time will be overestimated, however for the common case that the > > system is not oversubscribed it is more accurate because each > > measurement has less overhead. > > > > Bootstrapped on x86_64-linux with full test suite run. > > Thanks for doing this - I'd like to open up for discussion whether we > should simply > switch the default and stop recording user/system time for > -ftime-report. One reason > some infrastructure isn't using fine-grained timevars is because of overhead. > > So, shouldn't we go without the new option and simply change > -ftime-report behavior?
Personally I'd prefer the original approach of adding a separate -ftime-report-wall flag, instead of changing the output of the existing -ftime-report flag... > > Related - with -ftime-trace coming up again recently I wonder if we > should transition > to -ftime-report={user,wall,details,trace,...} allowing > -ftime-report=user,details. I'll note > that while adding an option, removing it later is always difficult. > > Richard. > > > gcc/ChangeLog: > > > > * common.opt (ftime-report-wall): Add. > > * common.opt.urls: Regenerate. > > * doc/invoke.texi: (ftime-report-wall): Document > > * gcc.cc (try_generate_repro): Check for -ftime-report-wall. > > * timevar.cc (get_time): Use clock_gettime if enabled. > > (timer::print): Print only wall time for time_report_wall. > > (make_json_for_timevar_time_def): Dito. > > * toplev.cc (toplev::start_timevars): Check for time_report_wall. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/ext/timevar3.C: New test. > > > > --- > > > > v2: Adjust JSON/Sarif output too. > > --- > > gcc/common.opt | 4 +++ > > gcc/common.opt.urls | 3 +++ > > gcc/doc/invoke.texi | 7 ++++++ > > gcc/gcc.cc | 3 ++- > > gcc/testsuite/g++.dg/ext/timevar3.C | 14 +++++++++++ > > gcc/timevar.cc | 38 +++++++++++++++++++++++------ > > gcc/toplev.cc | 3 ++- > > 7 files changed, 62 insertions(+), 10 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/ext/timevar3.C > > > > diff --git a/gcc/common.opt b/gcc/common.opt > > index 12b25ff486de..a200a8a0bc45 100644 > > --- a/gcc/common.opt > > +++ b/gcc/common.opt > > @@ -3014,6 +3014,10 @@ ftime-report > > Common Var(time_report) > > Report the time taken by each compiler pass. > > > > +ftime-report-wall > > +Common Var(time_report_wall) > > +Report the wall time taken by each compiler. > > + > > ftime-report-details > > Common Var(time_report_details) > > Record times taken by sub-phases separately. > > diff --git a/gcc/common.opt.urls b/gcc/common.opt.urls > > index e31736cd9945..6e79a8f9390b 100644 > > --- a/gcc/common.opt.urls > > +++ b/gcc/common.opt.urls > > @@ -1378,6 +1378,9 @@ > > UrlSuffix(gcc/Optimize-Options.html#index-fthread-jumps) > > ftime-report > > UrlSuffix(gcc/Developer-Options.html#index-ftime-report) > > > > +ftime-report-wall > > +UrlSuffix(gcc/Developer-Options.html#index-ftime-report-wall) > > + > > ftime-report-details > > UrlSuffix(gcc/Developer-Options.html#index-ftime-report-details) > > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > > index d38c1feb86f7..8c11d12e7521 100644 > > --- a/gcc/doc/invoke.texi > > +++ b/gcc/doc/invoke.texi > > @@ -784,6 +784,7 @@ Objective-C and Objective-C++ Dialects}. > > -frandom-seed=@var{string} -fsched-verbose=@var{n} > > -fsel-sched-verbose -fsel-sched-dump-cfg -fsel-sched-pipelining-verbose > > -fstats -fstack-usage -ftime-report -ftime-report-details > > +-ftime-report-wall > > -fvar-tracking-assignments-toggle -gtoggle > > -print-file-name=@var{library} -print-libgcc-file-name > > -print-multi-directory -print-multi-lib -print-multi-os-directory > > @@ -21048,6 +21049,12 @@ slightly different place within the compiler. > > @item -ftime-report-details > > Record the time consumed by infrastructure parts separately for each pass. > > > > +@opindex ftime-report-wall > > +@item -ftime-report-wall > > +Report statistics about compiler pass time consumpion, but only using wall > > +time. This is faster than @option{-ftime-report}, but can be more > > +influenced by background jobs. > > + > > @opindex fira-verbose > > @item -fira-verbose=@var{n} > > Control the verbosity of the dump file for the integrated register > > allocator. > > diff --git a/gcc/gcc.cc b/gcc/gcc.cc > > index 16fed46fb35f..8d3046eb7874 100644 > > --- a/gcc/gcc.cc > > +++ b/gcc/gcc.cc > > @@ -7964,7 +7964,8 @@ try_generate_repro (const char **argv) > > it might varry between invocations. */ > > else if (! strcmp (argv[nargs], "-quiet")) > > quiet = 1; > > - else if (! strcmp (argv[nargs], "-ftime-report")) > > + else if (! strcmp (argv[nargs], "-ftime-report") > > + || ! strcmp (argv[nargs], "-ftime-report-wall")) > > return; > > > > if (out_arg == -1 || !quiet) > > diff --git a/gcc/testsuite/g++.dg/ext/timevar3.C > > b/gcc/testsuite/g++.dg/ext/timevar3.C > > new file mode 100644 > > index 000000000000..b003f37f9654 > > --- /dev/null > > +++ b/gcc/testsuite/g++.dg/ext/timevar3.C > > @@ -0,0 +1,14 @@ > > +// PR c++/52248 > > +// { dg-options "-ftime-report-wall" } > > +// { dg-allow-blank-lines-in-output 1 } > > +// { dg-prune-output "Time variable" } > > +// { dg-prune-output "k" } > > +// { dg-prune-output " 0 " } > > +// { dg-prune-output "checks" } > > +// { dg-prune-output "\[0-9\]+%" } > > + > > +void > > +foo () > > +{ > > + goto lab; // { dg-error "not defined" } > > +} > > diff --git a/gcc/timevar.cc b/gcc/timevar.cc > > index 68bcf44864f9..4930b1175a13 100644 > > --- a/gcc/timevar.cc > > +++ b/gcc/timevar.cc > > @@ -243,6 +243,15 @@ get_time (struct timevar_time_def *now) > > now->ggc_mem = timevar_ggc_mem_total; > > > > { > > +#if _POSIX_C_SOURCE >= 199309L > > + if (time_report_wall) > > + { > > + struct timespec ts; > > + clock_gettime (CLOCK_MONOTONIC, &ts); > > + now->wall = ts.tv_sec * 1000000000 + ts.tv_nsec; > > + return; > > + } > > +#endif > > #ifdef USE_TIMES > > struct tms tms; > > now->wall = times (&tms) * ticks_to_nanosec; > > @@ -682,14 +691,16 @@ timer::print_row (FILE *fp, > > > > #ifdef HAVE_USER_TIME > > /* Print user-mode time for this process. */ > > - fprintf (fp, "%7.2f (%3.0f%%)", > > + if (!time_report_wall) > > + fprintf (fp, "%7.2f (%3.0f%%)", > > nanosec_to_floating_sec (elapsed.user), > > percent_of (total->user, elapsed.user)); > > #endif /* HAVE_USER_TIME */ > > > > #ifdef HAVE_SYS_TIME > > + if (!time_report_wall) > > /* Print system-mode time for this process. */ > > - fprintf (fp, "%7.2f (%3.0f%%)", > > + fprintf (fp, "%7.2f (%3.0f%%)", > > nanosec_to_floating_sec (elapsed.sys), > > percent_of (total->sys, elapsed.sys)); > > #endif /* HAVE_SYS_TIME */ > > @@ -754,7 +765,10 @@ timer::print (FILE *fp) > > TIMEVAR. */ > > m_start_time = now; > > > > - fprintf (fp, "\n%-35s%16s%14s%14s%14s\n", "Time variable", "usr", "sys", > > + if (time_report_wall) > > + fprintf (fp, "\n%-35s%16s%14s\n", "Time variable", "wall", "GGC"); > > + else > > + fprintf (fp, "\n%-35s%16s%14s%14s%14s\n", "Time variable", "usr", > > "sys", > > "wall", "GGC"); > > if (m_jit_client_items) > > fputs ("GCC items:\n", fp); > > @@ -810,13 +824,18 @@ timer::print (FILE *fp) > > /* Print total time. */ > > fprintf (fp, " %-35s:", "TOTAL"); > > #ifdef HAVE_USER_TIME > > - fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total->user)); > > + if (!time_report_wall) > > + fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total->user)); > > #endif > > #ifdef HAVE_SYS_TIME > > - fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->sys)); > > + if (!time_report_wall) > > + fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->sys)); > > #endif > > #ifdef HAVE_WALL_TIME > > - fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->wall)); > > + if (!time_report_wall) > > + fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->wall)); > > + else > > + fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total->wall)); > > #endif > > fprintf (fp, PRsa (7) "\n", SIZE_AMOUNT (total->ggc_mem)); > > > > @@ -842,8 +861,11 @@ json::object * > > make_json_for_timevar_time_def (const timevar_time_def &ttd) > > { > > json::object *obj = new json::object (); > > - obj->set_float ("user", nanosec_to_floating_sec (ttd.user)); > > - obj->set_float ("sys", nanosec_to_floating_sec (ttd.sys)); > > + if (!time_report_wall) > > + { > > + obj->set_float ("user", nanosec_to_floating_sec (ttd.user)); > > + obj->set_float ("sys", nanosec_to_floating_sec (ttd.sys)); > > + } > > obj->set_float ("wall", nanosec_to_floating_sec (ttd.wall)); > > obj->set_integer ("ggc_mem", ttd.ggc_mem); > > return obj; > > diff --git a/gcc/toplev.cc b/gcc/toplev.cc > > index 5df59b79c803..87fa3f4c8833 100644 > > --- a/gcc/toplev.cc > > +++ b/gcc/toplev.cc > > @@ -2247,7 +2247,8 @@ toplev::~toplev () > > void > > toplev::start_timevars () > > { > > - if (time_report || !quiet_flag || flag_detailed_statistics) > > + if (time_report || time_report_wall || !quiet_flag > > + || flag_detailed_statistics) > > timevar_init (); > > > > timevar_start (TV_TOTAL); > > -- > > 2.46.2 > >