Hi Akim,

> With respect to wall clock:
> 
>   - nothing in getrusage.  You suggest gettimeofday, but it’s
>     not monotonic, clock_gettime(CLOCK_MONOTONIC,…) is a better
>     option.

OK, point taken.

> With respect to precision:

>   - times returns a clock_t, in system dependent clock ticks.
>     Resolution is unclear, but I think I understood from various
>     place that it’s typically 1µs.

No, you can't assume that. You need to measure it. Here's what I get
on today's mainstream architecture (Linux 4.x, x86_64 CPU from Intel):

$ ./measure 1000
tms_utime: 0.000000
monotonic: 0.000015
$ ./measure 10000
tms_utime: 0.000000
monotonic: 0.000064
$ ./measure 100000
tms_utime: 0.000000
monotonic: 0.001127
$ ./measure 1000000
tms_utime: 0.010000
monotonic: 0.011327
$ ./measure 3000000
tms_utime: 0.020000
monotonic: 0.023167
$ ./measure 10000000
tms_utime: 0.050000
monotonic: 0.049540

So, very obviously, the resolution of times() is 20 milliseconds (!).

> - getrusage: 1µs on modern platforms, possibly 1ms
>     (https://stackoverflow.com/a/12480485/1353549)

1µs on modern platforms? No. Again, you need to measure it.

$ ./measure2 1000
tms_utime: 0.000000
monotonic: 0.000021
resources: 0.000000
$ ./measure2 10000
tms_utime: 0.000000
monotonic: 0.000131
resources: 0.000000
$ ./measure2 100000
tms_utime: 0.000000
monotonic: 0.000573
resources: 0.000000
$ ./measure2 1000000
tms_utime: 0.010000
monotonic: 0.011352
resources: 0.012000
$ ./measure2 3000000
tms_utime: 0.030000
monotonic: 0.036984
resources: 0.032000
$ ./measure2 10000000
tms_utime: 0.040000
monotonic: 0.040968
resources: 0.040000
$ ./measure2 500000
tms_utime: 0.000000
monotonic: 0.001559
resources: 0.000000
$ ./measure2 700000
tms_utime: 0.000000
monotonic: 0.008029
resources: 0.008000
$ ./measure2 600000
tms_utime: 0.000000
monotonic: 0.007116
resources: 0.008000
$ ./measure2 1100000
tms_utime: 0.010000
monotonic: 0.012702
resources: 0.012000
$ ./measure2 500000
tms_utime: 0.000000
monotonic: 0.005821
resources: 0.004000

So, the resolution of getrusage() is 4 ms. Slightly better than times(),
bit still orders of magnitude worse than CLOCK_MONOTONIC.

> With respect to what you called multiplicity (i.e., time spent
> in subprocesses or threads):
> 
> - getrusage offers two choices:
>   - RUSAGE_SELF: the calling process and its threads.
>   - RUSAGE_CHILDREN: main and subprocesses that have been wait’ed.
>   - Linux offers RUSAGE_THREAD, which is just about the current
>     thread.  I don’t think we care about this.
>     (https://linux.die.net/man/2/getrusage)

Good. Do you want 'timevar' to support RUSAGE_SELF and RUSAGE_CHILDREN,
at the user's choice, or do you want to pick one for the user?

In summary, I think the user needs to have the choice:
  - For low-precision timings, offer getrusage with RUSAGE_CHILDREN or
    optionally RUSAGE_SELF as an alternative.
  - For high-precision timings, offer CLOCK_MONOTONIC and don't call
    neither getrusage nor times (because the point of high-precision
    timings is to have minimal system call overhead).

And this choice needs to be made at runtime. It would be silly to have
to recompile a program just to measure longer vs. shorter tasks.

Bruno
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/times.h>
#include <time.h>

int
main (int argc, char **argv)
{
  int repeat = atoi (argv[1]);

  struct tms tms_before;
  struct tms tms_after;
  struct timespec monotonic_before;
  struct timespec monotonic_after;

  times (&tms_before);
  clock_gettime (CLOCK_MONOTONIC, &monotonic_before);

  for (int i = repeat; i > 0; i--)
    ;

  times (&tms_after);
  clock_gettime (CLOCK_MONOTONIC, &monotonic_after);

  printf ("tms_utime: %.6f\n", (double)(tms_after.tms_utime - tms_before.tms_utime) * 1.0 / sysconf (_SC_CLK_TCK));
  printf ("monotonic: %.6f\n", (double)(((long long)monotonic_after.tv_sec - (long long)monotonic_before.tv_sec) * 1000000000LL + monotonic_after.tv_nsec - monotonic_before.tv_nsec) * 0.000000001);
}
  
/*
 * Local Variables:
 * compile-command: "gcc -Wall -o measure measure.c -lrt"
 * End:
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/times.h>
#include <sys/resource.h>
#include <time.h>

int
main (int argc, char **argv)
{
  int repeat = atoi (argv[1]);

  struct tms tms_before;
  struct tms tms_after;
  struct timespec monotonic_before;
  struct timespec monotonic_after;
  struct rusage resources_before;
  struct rusage resources_after;

  times (&tms_before);
  clock_gettime (CLOCK_MONOTONIC, &monotonic_before);
  getrusage (RUSAGE_SELF, &resources_before);

  for (int i = repeat; i > 0; i--)
    ;

  times (&tms_after);
  clock_gettime (CLOCK_MONOTONIC, &monotonic_after);
  getrusage (RUSAGE_SELF, &resources_after);

  printf ("tms_utime: %.6f\n", (double)(tms_after.tms_utime - tms_before.tms_utime) * 1.0 / sysconf (_SC_CLK_TCK));
  printf ("monotonic: %.6f\n", (double)(((long long)monotonic_after.tv_sec - (long long)monotonic_before.tv_sec) * 1000000000LL + monotonic_after.tv_nsec - monotonic_before.tv_nsec) * 0.000000001);
  printf ("resources: %.6f\n", (double)(((long long)resources_after.ru_utime.tv_sec - (long long)resources_before.ru_utime.tv_sec) * 1000000LL + resources_after.ru_utime.tv_usec - resources_before.ru_utime.tv_usec) * 0.000001);
}
  
/*
 * Local Variables:
 * compile-command: "gcc -Wall -o measure2 measure2.c -lrt"
 * End:
 */

Reply via email to