On 06/01/21 14:34, Vincent Guittot wrote:
> Setting LBF_ALL_PINNED during active load balance is only valid when there
> is only 1 running task on the rq otherwise this ends up increasing the
> balance interval whereas other tasks could migrate after the next interval
> once they become cache-cold as an example.
>
> Signed-off-by: Vincent Guittot <[email protected]>
> ---
>  kernel/sched/fair.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5428b8723e61..69a455113b10 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9759,7 +9759,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>                       if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
>                               raw_spin_unlock_irqrestore(&busiest->lock,
>                                                           flags);
> -                             env.flags |= LBF_ALL_PINNED;
> +                             if (busiest->nr_running == 1)
> +                                     env.flags |= LBF_ALL_PINNED;

So LBF_ALL_PINNED *can* be set if busiest->nr_running > 1, because
before we get there we have:

  if (busiest->nr_running > 1) {
      env.flags |= LBF_ALL_PINNED;
      detach_tasks(&env); // Removes LBF_ALL_PINNED if > 0 tasks can be pulled
      ...
  }

What about following the logic used by detach_tasks() and only clearing
the flag? Something like the below. If nr_running > 1, then we'll have
gone through detach_tasks() and will have cleared the flag (if
possible).
---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 04a3ce20da67..211c86ba3f5b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9623,6 +9623,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
        env.src_rq = busiest;
 
        ld_moved = 0;
+       /* Clear this as soon as we find a single pullable task */
+       env.flags |= LBF_ALL_PINNED;
        if (busiest->nr_running > 1) {
                /*
                 * Attempt to move tasks. If find_busiest_group has found
@@ -9630,7 +9632,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
                 * still unbalanced. ld_moved simply stays zero, so it is
                 * correctly treated as an imbalance.
                 */
-               env.flags |= LBF_ALL_PINNED;
                env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
 
 more_balance:
@@ -9756,10 +9757,11 @@ static int load_balance(int this_cpu, struct rq *this_rq,
                        if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
                                raw_spin_unlock_irqrestore(&busiest->lock,
                                                            flags);
-                               env.flags |= LBF_ALL_PINNED;
                                goto out_one_pinned;
                        }
 
+                       env.flags &= ~LBF_ALL_PINNED;
+
                        /*
                         * ->active_balance synchronizes accesses to
                         * ->active_balance_work.  Once set, it's cleared
---
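
To compare the three behaviours side by side, here is a rough userspace
toy model of the flag flow. It is not kernel code: struct toy_rq, its
fields, the enum and toy_flags_after_balance() are all made up for
illustration, and the LBF_ALL_PINNED value is arbitrary. It assumes
nothing was actually moved (ld_moved == 0) and that active balance is
attempted, and it reduces detach_tasks() to "clear the flag if at least
one examined task may run on this_cpu".

```c
#include <stdbool.h>

#define LBF_ALL_PINNED 0x01 /* illustrative value only */

/* Hypothetical, heavily simplified view of the busiest rq. */
struct toy_rq {
	unsigned int nr_running; /* tasks on the busiest rq */
	unsigned int nr_affine;  /* tasks seen by the detach_tasks() stand-in
				  * whose affinity allows this_cpu */
	bool curr_allowed;       /* may busiest->curr run on this_cpu? */
};

enum toy_variant {
	TOY_ORIGINAL,   /* code before either patch */
	TOY_SET_IF_ONE, /* quoted patch: set only if nr_running == 1 */
	TOY_CLEAR_ONLY, /* diff above: set once up front, then only clear */
};

/*
 * Returns env.flags as the model sees them when leaving the active
 * balance path, assuming no task was moved.
 */
static unsigned int toy_flags_after_balance(const struct toy_rq *busiest,
					    enum toy_variant v)
{
	unsigned int flags = 0;

	if (v == TOY_CLEAR_ONLY)
		flags |= LBF_ALL_PINNED;

	if (busiest->nr_running > 1) {
		if (v != TOY_CLEAR_ONLY)
			flags |= LBF_ALL_PINNED;

		/* Stand-in for detach_tasks(): one allowed task clears it. */
		if (busiest->nr_affine > 0)
			flags &= ~LBF_ALL_PINNED;

		/* Everything pinned: modelled as leaving with the flag set. */
		if (flags & LBF_ALL_PINNED)
			return flags;
	}

	/* Active balance: busiest->curr cannot run on this_cpu. */
	if (!busiest->curr_allowed) {
		if (v == TOY_ORIGINAL ||
		    (v == TOY_SET_IF_ONE && busiest->nr_running == 1))
			flags |= LBF_ALL_PINNED;
		return flags; /* goto out_one_pinned */
	}

	/* busiest->curr is pullable, so not everything is pinned. */
	if (v == TOY_CLEAR_ONLY)
		flags &= ~LBF_ALL_PINNED;

	return flags;
}
```

With nr_running = 2, nr_affine = 1 and curr_allowed = false (one other
task is allowed on this_cpu but was not moved, e.g. cache-hot),
TOY_ORIGINAL leaves LBF_ALL_PINNED set, while TOY_SET_IF_ONE and
TOY_CLEAR_ONLY both leave it clear. With a single pinned task
(nr_running = 1, curr_allowed = false) all three leave it set.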
