Hi,

On 29/10/25 20:08, Andrea Righi wrote:
> sched_ext currently suffers starvation due to RT. The same workload when
> converted to EXT can get zero runtime if RT is 100% running, causing EXT
> processes to stall. Fix it by adding a DL server for EXT.
> 
> A kselftest is also provided later to verify:
> 
>  # ./runner -t rt_stall
>  ===== START =====
>  TEST: rt_stall
>  DESCRIPTION: Verify that RT tasks cannot stall SCHED_EXT tasks
>  OUTPUT:
>  # Runtime of EXT task (PID 23338) is 0.250000 seconds
>  # Runtime of RT task (PID 23339) is 4.750000 seconds
>  # EXT task got 5.00% of total runtime
>  ok 1 PASS: EXT task got more than 4.00% of runtime
>  =====  END  =====
> 
> v3: - clarify that fair is not the only dl_server (Juri Lelli)
>     - remove explicit stop to reduce timer reprogramming overhead
>       (Juri Lelli)
>     - do not restart pick_task() when it's invoked by the dl_server
>       (Tejun Heo)
>     - depend on CONFIG_SCHED_CLASS_EXT (Andrea Righi)
> v2: - drop ->balance() now that pick_task() has an rf argument
>       (Andrea Righi)
> 
> Cc: Luigi De Matteis <[email protected]>
> Co-developed-by: Joel Fernandes <[email protected]>
> Signed-off-by: Joel Fernandes <[email protected]>
> Signed-off-by: Andrea Righi <[email protected]>
> ---

...

> @@ -1409,6 +1412,15 @@ static void enqueue_task_scx(struct rq *rq, struct task_struct *p, int enq_flags
>       if (enq_flags & SCX_ENQ_WAKEUP)
>               touch_core_sched(rq, p);
>  
> +     if (rq->scx.nr_running == 1) {
> +             /* Account for idle runtime */
> +             if (!rq->nr_running)

Hummm, didn't we just call add_nr_running(rq, 1) before getting here? If
so, !rq->nr_running can never be true at this point.

> +                     dl_server_update_idle_time(rq, rq->curr, &rq->ext_server);
> +
> +             /* Start dl_server if this is the first task being enqueued */
> +             dl_server_start(&rq->ext_server);
> +     }
> +
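
To make the ordering concern concrete, here is a minimal user-space sketch.
It is not kernel code: rq_model, scx_rq_model, add_nr_running_model() and
enqueue_model() are hypothetical stand-ins, and it assumes that both
rq->scx.nr_running and rq->nr_running are incremented earlier in
enqueue_task_scx(), before the new hunk runs.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel's rq and scx_rq. */
struct scx_rq_model {
	unsigned int nr_running;	/* runnable EXT tasks on this rq */
};

struct rq_model {
	unsigned int nr_running;	/* runnable tasks of all classes */
	struct scx_rq_model scx;
};

/* Models add_nr_running(): bumps the rq-wide runnable count. */
static void add_nr_running_model(struct rq_model *rq, unsigned int count)
{
	rq->nr_running += count;
}

/*
 * Models the assumed ordering in enqueue_task_scx(): both counters are
 * incremented first, then the new check runs. Returns true if the
 * "account for idle runtime" branch would be taken.
 */
static bool enqueue_model(struct rq_model *rq)
{
	bool idle_branch_taken = false;

	rq->scx.nr_running++;
	add_nr_running_model(rq, 1);

	if (rq->scx.nr_running == 1) {
		/* rq->nr_running is at least 1 here, so this is dead code. */
		if (!rq->nr_running)
			idle_branch_taken = true;
	}

	return idle_branch_taken;
}
```

Under that assumption, calling enqueue_model() on a zero-initialized
rq_model returns false even for the very first task, so the idle-time
accounting would never run.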

Thanks,
Juri

