On Thu, Jan 17, 2019 at 8:43 PM Michael S. Tsirkin <m...@redhat.com> wrote:
>
> On Thu, Jan 17, 2019 at 08:08:53PM -0500, Willem de Bruijn wrote:
> > From: Willem de Bruijn <will...@google.com>
> >
> > On multiqueue network devices, RPS maps are configured independently
> > for each receive queue through /sys/class/net/$DEV/queues/rx-*.
> >
> > On virtio-net currently all packets use the map from rx-0, because the
> > real rx queue is not known at time of map lookup by get_rps_cpu.
> >
> > Call skb_record_rx_queue in the driver rx path to make lookup work.
> >
> > Recording the receive queue has ramifications beyond RPS, such as in
> > sticky load balancing decisions for sockets (skb_tx_hash) and XPS.
> >
> > Reported-by: Mark Hlady <mhl...@google.com>
> > Signed-off-by: Willem de Bruijn <will...@google.com>
>
> And any examples how to see the benefit of this?
When there are fewer queues than cpus and RPS is used to spread load
across all cpus, it can be preferable to set up disjoint sets, such
that each cpu handling an rxq interrupt spreads work to an exclusive
set of neighbors, instead of all interrupt handling cores contending
on all other cores' softnet_data.

More subtly, even if the policy is to spread uniformly, it can be
preferable to set each queue's RPS map to all cores except the one
that handles that queue's interrupt, as that core already does some
work in the initial receive path.

It is also simply expected behavior for a multiqueue network device
that rxq RPS maps can be configured individually. The current silent
fallback to the rx-0 map is confusing, especially since the
rx-1/rps_cpus .. rx-n/rps_cpus files do exist and can be written.
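As a concrete sketch of the disjoint setup (the cpu count, masks and
irq affinities here are illustrative assumptions, not from the patch):
take a guest with 8 cpus and 2 receive queues, with the rx-0 and rx-1
interrupts pinned to cpus 0 and 4. Each queue can then spread to its
own exclusive set of three neighbors:

  # rx-0 irq on cpu 0 spreads to cpus 1-3 (mask 0x0e)
  echo 0e > /sys/class/net/$DEV/queues/rx-0/rps_cpus

  # rx-1 irq on cpu 4 spreads to cpus 5-7 (mask 0xe0)
  echo e0 > /sys/class/net/$DEV/queues/rx-1/rps_cpus

Without this patch the second write has no visible effect: get_rps_cpu
looks up the rx-0 map for all packets, so cpus 5-7 never receive any
of the load. With the rx queue recorded, each map steers only its own
queue's packets.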