Hi Akim, > With respect to wall clock: > > - nothing in getrusage. You suggest gettimeofday, but it’s > not monotonic, clock_gettime(CLOCK_MONOTONIC,…) is a better > option.
OK, point taken. > With respect to precision: > - times returns a clock_t, in system dependent clock ticks. > Resolution is unclear, but I think I understood from various > place that it’s typically 1µs. No, you can't assume that. You need to measure it. Here's what I get on today's mainstream architecture (Linux 4.x, x86_64 CPU from Intel): $ ./measure 1000 tms_utime: 0.000000 monotonic: 0.000015 $ ./measure 10000 tms_utime: 0.000000 monotonic: 0.000064 $ ./measure 100000 tms_utime: 0.000000 monotonic: 0.001127 $ ./measure 1000000 tms_utime: 0.010000 monotonic: 0.011327 $ ./measure 3000000 tms_utime: 0.020000 monotonic: 0.023167 $ ./measure 10000000 tms_utime: 0.050000 monotonic: 0.049540 So, very obviously, the resolution of times() is 20 milliseconds (!). > - getrusage: 1µs on modern platforms, possibly 1ms > (https://stackoverflow.com/a/12480485/1353549) 1µs on modern platforms? No. Again, you need to measure it. $ ./measure2 1000 tms_utime: 0.000000 monotonic: 0.000021 resources: 0.000000 $ ./measure2 10000 tms_utime: 0.000000 monotonic: 0.000131 resources: 0.000000 $ ./measure2 100000 tms_utime: 0.000000 monotonic: 0.000573 resources: 0.000000 $ ./measure2 1000000 tms_utime: 0.010000 monotonic: 0.011352 resources: 0.012000 $ ./measure2 3000000 tms_utime: 0.030000 monotonic: 0.036984 resources: 0.032000 $ ./measure2 10000000 tms_utime: 0.040000 monotonic: 0.040968 resources: 0.040000 $ ./measure2 500000 tms_utime: 0.000000 monotonic: 0.001559 resources: 0.000000 $ ./measure2 700000 tms_utime: 0.000000 monotonic: 0.008029 resources: 0.008000 $ ./measure2 600000 tms_utime: 0.000000 monotonic: 0.007116 resources: 0.008000 $ ./measure2 1100000 tms_utime: 0.010000 monotonic: 0.012702 resources: 0.012000 $ ./measure2 500000 tms_utime: 0.000000 monotonic: 0.005821 resources: 0.004000 So, the resolution of getrusage() is 4 ms. Slightly better than times(), bit still orders of magnitude worse than CLOCK_MONOTONIC. > With respect to what you called multiplicity (i.e., time spent > in subprocesses or threads): > > - getrusage offers two choices: > - RUSAGE_SELF: the calling process and its threads. > - RUSAGE_CHILDREN: main and subprocesses that have been wait’ed. > - Linux offers RUSAGE_THREAD, which is just about the current > thread. I don’t think we care about this. > (https://linux.die.net/man/2/getrusage) Good. Do you want 'timevar' to support RUSAGE_SELF and RUSAGE_CHILDREN, at the user's choice, or do you want to pick one for the user? In summary, I think the user needs to have the choice: - For low-precision timings, offer getrusage with RUSAGE_CHILDREN or optionally RUSAGE_SELF as an alternative. - For high-precision timings, offer CLOCK_MONOTONIC and don't call neither getrusage nor times (because the point of high-precision timings is to have minimal system call overhead). And this choice needs to be made at runtime. It would be silly to have to recompile a program just to measure longer vs. shorter tasks. Bruno
#include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/times.h> #include <time.h> int main (int argc, char **argv) { int repeat = atoi (argv[1]); struct tms tms_before; struct tms tms_after; struct timespec monotonic_before; struct timespec monotonic_after; times (&tms_before); clock_gettime (CLOCK_MONOTONIC, &monotonic_before); for (int i = repeat; i > 0; i--) ; times (&tms_after); clock_gettime (CLOCK_MONOTONIC, &monotonic_after); printf ("tms_utime: %.6f\n", (double)(tms_after.tms_utime - tms_before.tms_utime) * 1.0 / sysconf (_SC_CLK_TCK)); printf ("monotonic: %.6f\n", (double)(((long long)monotonic_after.tv_sec - (long long)monotonic_before.tv_sec) * 1000000000LL + monotonic_after.tv_nsec - monotonic_before.tv_nsec) * 0.000000001); } /* * Local Variables: * compile-command: "gcc -Wall -o measure measure.c -lrt" * End: */
#include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/times.h> #include <sys/resource.h> #include <time.h> int main (int argc, char **argv) { int repeat = atoi (argv[1]); struct tms tms_before; struct tms tms_after; struct timespec monotonic_before; struct timespec monotonic_after; struct rusage resources_before; struct rusage resources_after; times (&tms_before); clock_gettime (CLOCK_MONOTONIC, &monotonic_before); getrusage (RUSAGE_SELF, &resources_before); for (int i = repeat; i > 0; i--) ; times (&tms_after); clock_gettime (CLOCK_MONOTONIC, &monotonic_after); getrusage (RUSAGE_SELF, &resources_after); printf ("tms_utime: %.6f\n", (double)(tms_after.tms_utime - tms_before.tms_utime) * 1.0 / sysconf (_SC_CLK_TCK)); printf ("monotonic: %.6f\n", (double)(((long long)monotonic_after.tv_sec - (long long)monotonic_before.tv_sec) * 1000000000LL + monotonic_after.tv_nsec - monotonic_before.tv_nsec) * 0.000000001); printf ("resources: %.6f\n", (double)(((long long)resources_after.ru_utime.tv_sec - (long long)resources_before.ru_utime.tv_sec) * 1000000LL + resources_after.ru_utime.tv_usec - resources_before.ru_utime.tv_usec) * 0.000001); } /* * Local Variables: * compile-command: "gcc -Wall -o measure2 measure2.c -lrt" * End: */