On Thu, 1 Oct 2020 18:44:40 -0700 Wei Wang wrote:
> > Can you share the relative performance delta of this benchmark?
> >
> > Could you explain why threads are slower than ksoftirqd if you pin the
> > application away? From your cover letter it sounded like you want the
> > scheduler to see the NAPI load, but then you say you pinned the
> > application away from the NAPI cores for the test, so I'm confused.
> 
> No. We did not explicitly pin the application threads away.
> Application threads are free to run anywhere. What we do is we
> restrict the NAPI kthreads to only those CPUs handling rx interrupts.

Whatever. You pin the NAPI threads and hand-tune their number so the
load of the NAPI CPUs is always higher. If the workload changes the
system will get very unhappy.

> (For us, 8 cpus out of 56.) So the load on those CPUs is very high
> when running the test. And the scheduler is smart enough to avoid
> using those CPUs for the application threads automatically.
> Here are the results of one representative test:
>              cpu/op   50%tile   95%tile   99%tile
> base          71.47     417us    1.01ms     2.9ms
> kthread       67.84     396us     976us     2.4ms
> workqueue     69.68     386us     791us     1.9ms

Did you renice ksoftirqd in "base"?

> Actually, I remembered it wrong. It does seem workqueue is doing
> better on latencies. But cpu/op wise, kthread seems to be a bit
> better.

Q.E.D.
