On 3/13/26 20:35, Konstantin Khorenko wrote:
> Use list_del_init() instead of list_del() when removing
> se->cfs_rq_node in account_entity_dequeue(). This mirrors
> the existing pattern used for se->group_node on the line above.
This comparison with se->group_node is incorrect. We use list_del_init() on
se->group_node for a good reason: se->group_node is accessed from se outside
of a list walk, whereas se->cfs_rq_node is only accessed through a walk of the
cfs_rq->tasks list, so we know for sure that se->cfs_rq_node is always
initialized when we access it.
>
> list_del() poisons the prev/next pointers with LIST_POISON values.
> If the sched_entity is later accessed after the cfs_rq is freed
> (e.g. due to a stale timer or other use-after-free scenario), the
> poisoned pointers cause an immediate hard fault. While this is
> useful for debugging, it makes recovery impossible.
But we don't access se->cfs_rq_node from the timer handler.
>
> list_del_init() reinitializes the node to point to itself, so
> list_empty() checks on the freed node return true rather than
> dereferencing poisoned memory. This provides a safer default and
> makes the active_timer callback's list_empty(&cfs_rq->tasks)
> check return a benign result even in error scenarios.
>
> This is a defense-in-depth hardening complementary to the
> active_timer cancellation fix.
I think this patch is unnecessary. If we have preexisting memory corruption, we
don't really want to recover from it; we want to detect it. If we somehow end
up seeing poisoned pointers in a list, we at least get a kernel warning about
it, which can help us debug the issue instead of silently hiding it behind a
reinitialized list.
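To illustrate the difference, here is a minimal userspace sketch, not the
kernel code itself. It re-implements a tiny subset of the list API; the poison
constants and the helpers list_unlink(), node_is_poisoned() and demo() are
made up for illustration (the real poison values live in
include/linux/poison.h and depend on the configuration). After list_del() a
stale node is recognizable as stale, while after list_del_init() it looks
like a valid empty list:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Illustrative poison values only; not the kernel's real constants. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0x100)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0x122)

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical helper: unlink entry from its neighbours. */
static void list_unlink(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/* Unlink and poison: any later use of the node is detectable. */
static void list_del(struct list_head *entry)
{
	list_unlink(entry);
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

/* Unlink and point the node at itself: it now looks like an empty list. */
static void list_del_init(struct list_head *entry)
{
	list_unlink(entry);
	INIT_LIST_HEAD(entry);
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Hypothetical helper: does the node carry the poison markers? */
static bool node_is_poisoned(const struct list_head *entry)
{
	return entry->next == LIST_POISON1 || entry->prev == LIST_POISON2;
}

/* Remove one node each way and compare what a stale reader would see. */
static void demo(void)
{
	struct list_head head, a, b;

	INIT_LIST_HEAD(&head);
	list_add(&a, &head);
	list_add(&b, &head);

	list_del(&a);
	assert(node_is_poisoned(&a));	/* stale use can be flagged */

	list_del_init(&b);
	assert(!node_is_poisoned(&b));
	assert(list_empty(&b));		/* stale use looks benign */
}
```

In this sketch, a buggy late access to a sees the poison markers and can be
reported, while the same access to b passes a list_empty() check with no hint
that anything went wrong.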
>
> https://virtuozzo.atlassian.net/browse/VSTOR-126785
>
> Signed-off-by: Konstantin Khorenko <[email protected]>
>
> Feature: sched: ability to limit number of CPUs available to a CT
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 9b0fe4c8a272f..8ed4cfa0dc83e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3298,7 +3298,7 @@ account_entity_dequeue(struct cfs_rq *cfs_rq, struct
> sched_entity *se)
> account_numa_dequeue(rq_of(cfs_rq), task_of(se));
> list_del_init(&se->group_node);
> #ifdef CONFIG_CFS_CPULIMIT
> - list_del(&se->cfs_rq_node);
> + list_del_init(&se->cfs_rq_node);
> #endif
> }
> #endif
--
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.