On 3/20/26 10:21, Dmitry Sepp wrote: > ________________________________________ > From: Pavel Tikhomirov <[email protected]> > Sent: Thursday, 19 March 2026 14:28 > To: Konstantin Khorenko <[email protected]>; Dmitry Sepp > <[email protected]> > Cc: [email protected] <[email protected]> > Subject: Re: [PATCH vz10 v4 1/3] sched: Clean up vCPU handling logic > > > > On 3/19/26 14:12, Konstantin Khorenko wrote: >> On 3/19/26 10:47, Dmitry Sepp wrote: >>> The idea behind the change is to transition from the existing spatial >>> vCPU handling approach that introduces costly modification to the >>> scheduling logic to ensure the requested CPU count is obeyed (10%+ >>> performance drop in some tests) to temporal isolation that can be >>> provided by the cgroup2 cpu.max. >>> >>> Drop the legacy unneeded vCPU handling code. Remove the 'cpu.rate' >>> control in favor of the internal calculation based on 'quota' and >>> 'period' from 'cpu.max'. As 'cpu.max' is not implicitly used to set the >>> rate, do not override nr_cpus when handling writes to 'cpu.max'. >>> >>> https://virtuozzo.atlassian.net/browse/VSTOR-124385 >>> >>> Signed-off-by: Dmitry Sepp <[email protected]> >>> --- >>> include/linux/sched.h | 6 - >>> include/linux/sched/topology.h | 5 - >>> kernel/sched/core.c | 98 +------- >>> kernel/sched/fair.c | 408 --------------------------------- >>> kernel/sched/sched.h | 10 - >>> 5 files changed, 12 insertions(+), 515 deletions(-) >>> >> ... >> >>> -static int tg_set_cpu_limit(struct task_group *tg, >>> - unsigned long cpu_rate, unsigned int nr_cpus) >>> +static int tg_set_cpu_limit(struct task_group *tg, unsigned int nr_cpus) >>> { >>> int ret; >>> unsigned long rate; >>> + unsigned long cpu_rate = tg->cpu_rate; >>> u64 quota = RUNTIME_INF; >>> u64 burst = tg_get_cfs_burst(tg); >>> u64 period = default_cfs_period(); >>> @@ -10090,21 +10041,6 @@ static int tg_set_cpu_limit(struct task_group *tg, >>> return ret; >>> } >>> >> >> static int tg_set_cpu_limit(struct task_group *tg, unsigned int nr_cpus) >> { >> int ret; >> unsigned long rate; >> unsigned long cpu_rate = tg->cpu_rate; >> u64 quota = RUNTIME_INF; >> u64 burst = tg_get_cfs_burst(tg); >> u64 period = default_cfs_period(); >> >> rate = (cpu_rate && nr_cpus) ? >> min_t(unsigned long, cpu_rate, nr_cpus * MAX_CPU_RATE) : >> max_t(unsigned long, cpu_rate, nr_cpus * MAX_CPU_RATE); >> if (rate) { >> quota = div_u64(period * rate, MAX_CPU_RATE); >> quota = max(quota, min_cfs_quota_period); >> } >> >> cpus_read_lock(); >> mutex_lock(&cfs_constraints_mutex); >> ret = __tg_set_cfs_bandwidth(tg, period, quota, burst); >> if (!ret) { >> tg->cpu_rate = cpu_rate; >> // the line above is a no-op. >> // i will apply the patch as is but in case this was not intended, please >> fix it with an incremental patch > > We use ->cpu_rate in cpu_cgroup_update_vcpustat, should not it be updated to > the value of rate here? > > I reused the logic of the original code here. It seems rate was only intended > to calculate quota. The following would apply to both nr_cpus and cpu_rate: > if we set nr_cpus, tg->cpu_rate would be passed as the rate parameter and > would not be changed. And other way round, if the rate control was written > the nr_cpu argument was merely tg->nr_cpu, which would not be changed while > changing rate. As we now don't provide the rate control there is no way to > change rate via tg_set_cpu_limit(), only through cpu.max -> > tg_update_cpu_limit().
My worry was that this "no-op" is an indication of mistake in original logic. But now when I look on cpu_cgroup_update_vcpustat() more closely, it gracefully handles the case of unset ->cpu_rate, so all is fine. Current implementation is correct. Thanks! > >> >> >> tg->nr_cpus = nr_cpus; >> } >> mutex_unlock(&cfs_constraints_mutex); >> cpus_read_unlock(); >> >> return ret; >> } > > -- > Best regards, Pavel Tikhomirov > Senior Software Developer, Virtuozzo. -- Best regards, Pavel Tikhomirov Senior Software Developer, Virtuozzo. _______________________________________________ Devel mailing list [email protected] https://lists.openvz.org/mailman/listinfo/devel
