On Fri, 30 Oct 2020 at 03:11, Rik van Riel <[email protected]> wrote:
>
> On Mon, 2020-10-26 at 17:52 +0100, Vincent Guittot wrote:
> > On Mon, 26 Oct 2020 at 17:48, Chris Mason <[email protected]> wrote:
> > > On 26 Oct 2020, at 12:20, Vincent Guittot wrote:
> > >
> > > > what you are suggesting is something like:
> > > >
> > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > index 4978964e75e5..3b6fbf33abc2 100644
> > > > --- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -9156,7 +9156,8 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
> > > >           * emptying busiest.
> > > >           */
> > > >          if (local->group_type == group_has_spare) {
> > > > -                if (busiest->group_type > group_fully_busy) {
> > > > +                if ((busiest->group_type > group_fully_busy) &&
> > > > +                    !(env->sd->flags & SD_SHARE_PKG_RESOURCES)) {
> > > >                          /*
> > > >                           * If busiest is overloaded, try to fill spare
> > > >                           * capacity. This might end up creating spare capacity
> > > >
> > > > which also fixes the problem for me and aligns LB with the wakeup path
> > > > regarding migration within the LLC
> > >
> > > Vincent’s patch on top of 5.10-rc1 looks pretty great:
> > >
> > > Latency percentiles (usec) runtime 90 (s) (3320 total samples)
> > >         50.0th: 161 (1687 samples)
> > >         75.0th: 200 (817 samples)
> > >         90.0th: 228 (488 samples)
> > >         95.0th: 254 (164 samples)
> > >         *99.0th: 314 (131 samples)
> > >         99.5th: 330 (17 samples)
> > >         99.9th: 356 (13 samples)
> > >         min=29, max=358
> > >
> > > Next we test in prod, which probably won’t have answers until
> > > tomorrow. Thanks again Vincent!
> >
> > Great !
> >
> > I'm going to run more tests on my setup as well to make sure that it
> > doesn't generate unexpected side effects on other kinds of use cases.
>
> We have tested the patch with several pretty demanding
> workloads for the past several days, and it seems to
> do the trick!
>
> With all the current scheduler code from the Linus tree,
> plus this patch on top, performance is as good as it ever
> was before with one workload, and slightly better with
> the other.
Thanks for the test results. I still have a few tests to run on my systems,
but the current results look good for me too.

>
> --
> All Rights Reversed.
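
Side note for anyone who wants to poke at the decision logic outside the
kernel: below is a minimal standalone C sketch of the condition the patch
changes. It is not kernel code; the mock_* structs and the flag value are
made up for illustration, and only the group_type names/ordering and the
general shape of the check follow kernel/sched/fair.c. What it demonstrates
is that, with the extra SD_SHARE_PKG_RESOURCES test, an overloaded busiest
group inside an LLC domain no longer triggers the "fill spare capacity"
path, so the balancer falls back to evening out the number of running
tasks, which matches what the wakeup path does within the LLC.

/*
 * Standalone sketch, NOT kernel code: a toy model of the condition the
 * patch above changes in calculate_imbalance().  The enum names and
 * their ordering follow kernel/sched/fair.c around v5.10; the structs
 * and the flag value are simplified mocks for illustration only.
 */
#include <stdbool.h>
#include <stdio.h>

#define MOCK_SD_SHARE_PKG_RESOURCES 0x1	/* stand-in for the real LLC flag */

enum group_type {
	group_has_spare = 0,
	group_fully_busy,
	group_misfit_task,
	group_asym_packing,
	group_imbalanced,
	group_overloaded,
};

struct mock_sched_domain { int flags; };
struct mock_lb_env { struct mock_sched_domain *sd; };

/*
 * Returns true when the balancer would try to fill the local group's
 * spare capacity from an overloaded busiest group.  With the patch,
 * that path is skipped inside an LLC domain, so balancing there falls
 * back to evening out nr_running instead of migrating utilization.
 */
static bool fill_spare_from_overloaded(const struct mock_lb_env *env,
				       enum group_type local,
				       enum group_type busiest)
{
	if (local != group_has_spare)
		return false;

	return busiest > group_fully_busy &&
	       !(env->sd->flags & MOCK_SD_SHARE_PKG_RESOURCES);
}

int main(void)
{
	struct mock_sched_domain llc = { .flags = MOCK_SD_SHARE_PKG_RESOURCES };
	struct mock_sched_domain pkg = { .flags = 0 };
	struct mock_lb_env env_llc = { .sd = &llc };
	struct mock_lb_env env_pkg = { .sd = &pkg };

	printf("LLC domain, busiest overloaded:     %s\n",
	       fill_spare_from_overloaded(&env_llc, group_has_spare, group_overloaded)
	       ? "fill spare capacity" : "balance on nr_running");
	printf("non-LLC domain, busiest overloaded: %s\n",
	       fill_spare_from_overloaded(&env_pkg, group_has_spare, group_overloaded)
	       ? "fill spare capacity" : "balance on nr_running");
	return 0;
}

Compiling and running it just prints which branch each case takes, with
only the non-LLC domain choosing the "fill spare capacity" path.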

