On Tue, Jun 23, 2026 at 03:52:30PM +0200, Michal Koutný wrote:
> On Mon, Jun 22, 2026 at 03:43:04PM -0400, Joe Simmons-Talbott
> <[email protected]> wrote:
> > +static long
> > +_get_config_hz(void)
> > +{
> > +	long hz = -1;
> > +	FILE *f;
> > +	char cmd[256] = "zcat /proc/config.gz 2>/dev/null | grep '^CONFIG_HZ='";
> > +
> > +	f = popen(cmd, "r");
> > +
> > +	if (!f)
> > +		goto out;
> > +
> > +	fscanf(f, "CONFIG_HZ=%ld", &hz);
> > +
> > +out:
> > +	pclose(f);
> > +	return hz;
> > +}
>
> I like that you voiced this dependency on CONFIG_HZ and also that
> _SC_CLK_TCK is useless in this regard.
> (I see that BPF selftests have similar infra for this.)
>
>
> > +
> >  /*
> >   * This test creates a cgroup with some maximum value within a period, and
> >   * verifies that a process in the cgroup is not overscheduled.
> > @@ -646,7 +669,8 @@ test_cpucg_nested_weight_underprovisioned(const char
> > *root)
> >  static int test_cpucg_max(const char *root)
> >  {
> >  	int ret = KSFT_FAIL;
> > -	long quota_usec = 1000;
> > +	long hz = _get_config_hz();
> > +	long quota_usec;
> >  	long default_period_usec = 100000; /* cpu.max's default period */
> >  	long duration_seconds = 1;
>
> I would not bend the tested value but its expectation (so that
> approximately the same quantity is tested across configs).
>
> I reckon the problem might be tasks that overrun the quota due to a long
> tick. Fortunately, we can assume this is compensated over multiple
> periods, so _on average_ the quota should be honored (more) precisely.
> But the test duration may not be well aligned with all the compensation
> periods, so that must be accounted for in the expectation.
>
> When I write it all down, I get this:
>
> --- a/tools/testing/selftests/cgroup/test_cpu.c
> +++ b/tools/testing/selftests/cgroup/test_cpu.c
> @@ -651,7 +651,9 @@ static int test_cpucg_max(const char *root)
>  	long duration_seconds = 1;
>
>  	long duration_usec = duration_seconds * USEC_PER_SEC;
> -	long usage_usec, n_periods, remainder_usec, expected_usage_usec;
> +	long usage_usec, expected_usage_usec;
> +	long n_periods, spread_periods, unaligned;
> +	long tick_usec, low_usage, high_usage;
>  	char *cpucg;
>  	char quota_buf[32];
>
> @@ -687,9 +689,16 @@ static int test_cpucg_max(const char *root)
>  	 * the cpu hog is set to run as per wall-clock time
>  	 */
>  	n_periods = duration_usec / default_period_usec;
> -	remainder_usec = duration_usec - n_periods * default_period_usec;
> -	expected_usage_usec
> -		= n_periods * quota_usec + MIN(remainder_usec, quota_usec);
> +	tick_usec = USEC_PER_SEC / hz;
> +	/* Up to tick_usec (over)run is compensated over multiple periods */
> +	spread_periods = MAX(1, tick_usec / quota_usec);
> +	low_usage = n_periods / spread_periods;
> +	high_usage = (n_periods + spread_periods - 1) / spread_periods;
> +	unaligned = n_periods % spread_periods;
> +
> +	expected_usage_usec = quota_usec * (
> +		unaligned * high_usage +
> +		(spread_periods - unaligned) * low_usage);
>
>  	if (!values_close_report(usage_usec, expected_usage_usec, 10))
>  		goto cleanup;
>
>
> (I neglected (and dropped) remainder_usec because it is zero with
> default values)
>
> However, not all preemptions are tick-based, so there'd be noise
> and one has to tune the values_close_report(,,err) anyway.
>
> Then to reduce noise, the simpler solution is to let the test run
> longer
>
> 	duration_usec = duration_seconds * USEC_PER_SEC * 1000 / hz;
>
> (where 1000 is the CONFIG_HZ=1000 where the test runs sufficiently [1] well.)
>
> Joe, how do the two variants above (unalignment accounting and prolonged
> duration) affect test_cpu behavior on your setup?

Hi Michal,

Thank you for your review. I tried both approaches, unalignment account and
prolonged duration, and both allowed me to run 10 iterations of the test_cpu
tests without any failures. I will use the simpler prolonged duration approach
in v3 if that is okay.

Thanks,
Joe

>
> (I'm personally wondering what is bigger quantity: systemic error due to
> HZ quantization or random (SMP) error.)
>
> Thanks,
> Michal
>
> [1] Even there one runs into noise depending on nr_cpus, thus even that
> fixed err=10 is not ideal.

