On Thu, 6 Aug 2020 12:25:08 -0700 Eric Dumazet wrote:
> On 8/6/20 11:55 AM, Jakub Kicinski wrote:
> > I'm still trying to wrap my head around this.
> >
> > Am I understanding correctly that you have one IRQ and multiple NAPI
> > instances?
> >
> > Are we not going to end up with pretty terrible cache locality here if
> > the scheduler starts to throw rx and tx completions around to random
> > CPUs?
> >
> > I understand that implementing separate kthreads would be more LoC, but
> > we do have ksoftirqs already... maybe we should make the NAPI ->
> > ksoftirq mapping more flexible, and improve the logic which decides to
> > load ksoftirq rather than make $current() pay?
> >
> > Sorry for being slow.
>
> Issue with ksoftirqd is that
> - it is bound to a cpu
Do you envision the scheduler balancing or work stealing being
advantageous in some configurations?

I was guessing that for compute workloads having ksoftirq bound will
actually make things more predictable/stable.

For pure routers (where we expect multiple cores to reach 100% just
doing packet forwarding) as long as there is an API to re-balance NAPIs
to cores - a simple specialized user space daemon would probably do a
better job as it can consult packet drop metrics etc.

Obviously I have no data to back up these claims..

> - Its nice value is 0, meaning that user threads can sometime compete
>   too much with it.

True, I thought we could assume user level tuning.

> - It handles all kinds of softirqs, so messing with it might hurt some
>   other layer.

Right, I have no data on how much this hurts in practice.

> Note that the patch is using a dedicate work queue. It is going to be
> not practical in case you need to handle two different NIC, and want
> separate pools for each of them.
>
> Ideally, having one kthread per queue would be nice, but then there is
> more plumbing work to let these kthreads being visible in a convenient
> way (/sys/class/net/ethX/queues/..../kthread)

Is context switching cost negligible? ksoftirq-like thread replicates
all the NAPI budget-level mixing we already do today.
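
To make the user space daemon idea above slightly more concrete, here is a rough sketch of the decision step only. Everything in it is hypothetical: the function name, the queue names, and the use of per-queue drop counters as the only load signal are assumptions for illustration. It also assumes some API to move a NAPI instance to another core exists, which is exactly the part that would still need to be built.

```python
def pick_rebalance(queue_to_cpu, drops, cpus):
    """Suggest one (queue, new_cpu) move, or None if nothing would help.

    queue_to_cpu: dict mapping queue name -> CPU currently servicing it
    drops:        dict mapping queue name -> packets dropped in the last
                  sampling interval (however the daemon collects them)
    cpus:         list of CPUs the daemon is allowed to use
    """
    # Sum the drops attributed to each CPU; this is the only load proxy used.
    load = {cpu: 0 for cpu in cpus}
    for queue, cpu in queue_to_cpu.items():
        load[cpu] += drops.get(queue, 0)

    busiest = max(cpus, key=lambda c: load[c])
    idlest = min(cpus, key=lambda c: load[c])
    if load[busiest] == load[idlest]:
        return None  # already balanced by this metric

    # Among the queues on the busiest CPU, pick the one whose move leaves
    # the smaller worst-case load across the two CPUs involved.
    best_queue, best_peak = None, load[busiest]
    for queue, cpu in queue_to_cpu.items():
        if cpu != busiest:
            continue
        d = drops.get(queue, 0)
        peak = max(load[busiest] - d, load[idlest] + d)
        if d > 0 and peak < best_peak:
            best_queue, best_peak = queue, peak

    if best_queue is None:
        return None  # any move would just shift the hot spot
    return best_queue, idlest


mapping = {"rx-0": 0, "rx-1": 0, "rx-2": 1}
sampled = {"rx-0": 100, "rx-1": 40, "rx-2": 5}
print(pick_rebalance(mapping, sampled, [0, 1]))
```

A real daemon would presumably also look at CPU utilization and cache/NUMA locality before moving anything; drop counters alone say nothing about the cache locality concern raised earlier in the thread.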