tg_set_cpu_limit() calls __tg_set_cfs_bandwidth(), which iterates over
for_each_online_cpu(i) and takes per-CPU rq locks. However,
tg_set_cpu_limit() does not hold cpus_read_lock().

The requirement to hold cpus_read_lock() was introduced by upstream
commit 0e59bdaea75f ("sched/fair: Disable runtime_enabled on dying rq"),
which changed the iteration in __tg_set_cfs_bandwidth() from
for_each_possible_cpu to for_each_online_cpu and added
get_online_cpus()/put_online_cpus() (the predecessors of
cpus_read_lock()/cpus_read_unlock()) around the call. This was done to
prevent a race between setting cfs_rq->runtime_enabled and
unthrottle_offline_cfs_rqs().

If a CPU goes offline while __tg_set_cfs_bandwidth() is executing inside
tg_set_cpu_limit(), the function may re-enable runtime_enabled on a
dying CPU's cfs_rq after unthrottle_offline_cfs_rqs() has already
cleared it, leaving tasks stranded on a dead CPU with no way to
migrate.

The bug was inherited from the original commit
4514c5835d32f ("sched: Port CONFIG_CFS_CPULIMIT feature"),
where tg_set_cpu_limit() was ported from vz7 (kernel 3.10) without
accounting for the changed locking requirements. In the vz7 kernel,
__tg_set_cfs_bandwidth() used for_each_possible_cpu, so cpus_read_lock()
was not needed.

Fixes: 4514c5835d32f ("sched: Port CONFIG_CFS_CPULIMIT feature")

https://virtuozzo.atlassian.net/browse/VSTOR-127251

Signed-off-by: Dmitry Sepp <[email protected]>
---
 kernel/sched/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0423c1b323ca..36cef7e6bfeb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10030,6 +10030,7 @@ static int tg_set_cpu_limit(struct task_group *tg, unsigned int nr_cpus)
                quota = max(quota, min_cfs_quota_period);
        }
 
+       cpus_read_lock();
        mutex_lock(&cfs_constraints_mutex);
        ret = __tg_set_cfs_bandwidth(tg, period, quota, burst);
        if (!ret) {
@@ -10037,6 +10038,7 @@ static int tg_set_cpu_limit(struct task_group *tg, unsigned int nr_cpus)
                tg->nr_cpus = nr_cpus;
        }
        mutex_unlock(&cfs_constraints_mutex);
+       cpus_read_unlock();
 
        return ret;
 }
-- 
2.47.1
