On 02/25, Alexei Starovoitov wrote:
> On 2/25/19 3:07 PM, Stanislav Fomichev wrote:
> >> +#define BPF_PROG_RUN(prog, ctx)	({				\
> >> +	u32 ret;							\
> >> +	cant_sleep();							\
> >> +	if (static_branch_unlikely(&bpf_stats_enabled_key)) {		\
> >> +		struct bpf_prog_stats *stats;				\
> >> +		u64 start = sched_clock();				\
> > QQ: why sched_clock() and not, for example, ktime_get_ns() which we do
> > in the bpf_test_run()? Or even why not local_clock()?
> > I'm just wondering what kind of trade-off we are making here
> > regarding precision vs run time cost.
>
> I'm making this decision based on documentation:
> Documentation/timers/timekeeping.txt
> "Compared to clock sources, sched_clock() has to be very fast: it is
> called much more often, especially by the scheduler. If you have to do
> trade-offs between accuracy compared to the clock source, you may
> sacrifice accuracy for speed in sched_clock()."

So sched_clock() is fast but imprecise, while ktime_get_ns() (and local_clock()?) are slower but more precise?
If that's the case, would it make sense to use a more precise measurement? I suppose a BPF program's execution time is on the order of nanoseconds, and if sched_clock() only has usec or msec resolution, what we collect is essentially noise? I understand that you want this feature to have almost no overhead, but since it's already gated by the static key, should we aim for higher precision?
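For my own understanding, I imagine the rest of the stats branch looks roughly like the sketch below (I only have the quoted fragment, so the per-CPU accumulation and the field names are my guess, not copied from your patch):

#define BPF_PROG_RUN(prog, ctx)	({				\
	u32 ret;						\
	cant_sleep();						\
	if (static_branch_unlikely(&bpf_stats_enabled_key)) {	\
		struct bpf_prog_stats *stats;			\
		u64 start = sched_clock();			\
		/* invoke the program itself */			\
		ret = (*(prog)->bpf_func)(ctx, (prog)->insnsi);	\
		/* per-CPU accumulation; names are my guess */	\
		stats = this_cpu_ptr((prog)->aux->stats);	\
		u64_stats_update_begin(&stats->syncp);		\
		stats->cnt++;					\
		stats->nsecs += sched_clock() - start;		\
		u64_stats_update_end(&stats->syncp);		\
	} else {						\
		ret = (*(prog)->bpf_func)(ctx, (prog)->insnsi);	\
	}							\
	ret; })

If that's the shape of it, then with the key off the only cost is the static branch, and with it on we pay two sched_clock() reads per program run, which I assume is exactly why the cheap-but-coarse clock was chosen over ktime_get_ns(); my question is whether that extra cost on the stats path is really significant compared to the precision we give up.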