On Thu, May 24, 2018 at 04:45:41PM +0200, Jesper Dangaard Brouer wrote:
> This patchset changes the ndo_xdp_xmit API to take a bulk of xdp frames.
>
> When the kernel is compiled with CONFIG_RETPOLINE, every indirect
> function pointer (branch) call hurts performance. For XDP this has a
> huge negative performance impact.
>
> This patchset reduces the needed (indirect) calls to ndo_xdp_xmit, but
> also prepares for further optimizations. The DMA API's use of indirect
> function pointer calls is the primary source of the regression. It is
> left for a followup patchset to use bulking calls towards the DMA API
> (via the scatter-gather calls).
>
> The other advantage of this API change is that drivers can more easily
> amortize the cost of any sync/locking scheme over the bulk of packets.
> The assumption of the current API is that the driver implementing the
> NDO will also allocate a dedicated XDP TX queue for every CPU in the
> system, which is not always possible or practical to configure. E.g.
> ixgbe cannot load an XDP program on a machine with more than 96 CPUs,
> due to limited hardware TX queues. E.g. virtio_net is hard to configure
> as it requires manually increasing the queues. E.g. the tun driver
> chooses to use a per-XDP-frame producer lock, modulo smp_processor_id
> over the available queues.
>
> I've considered adding 'flags' to ndo_xdp_xmit, but it's not part of
> this patchset. This will be a followup patchset, once we know if this
> will be needed (e.g. for a non-map xdp_redirect flush-flag, and if
> AF_XDP chooses to use ndo_xdp_xmit for TX).
>
> ---
> V5: Fixed up issues spotted by Daniel and John
>
> V4: Split out the patches from 4 to 8 patches. I cannot split the
> driver changes from the NDO change, but I've tried to isolate the NDO
> change together with the driver change as much as possible.
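For anyone following along, here is a rough sketch of the bulked NDO shape
described above. This is illustration only, not the applied patches:
example_xdp_xmit() and example_xmit_one() are hypothetical names, and the
exact prototype and error/return semantics that landed in bpf-next may
differ.

  /* Sketch of a driver-side bulk ndo_xdp_xmit: one indirect call
   * covers n frames instead of one call per frame.
   */
  #include <linux/netdevice.h>
  #include <net/xdp.h>

  /* Hypothetical per-frame helper; assumed to return 0 on success. */
  static int example_xmit_one(struct net_device *dev,
  			      struct xdp_frame *frame);

  static int example_xdp_xmit(struct net_device *dev, int n,
  			      struct xdp_frame **frames)
  {
  	int i, sent = 0;

  	/* Any TX-queue selection/locking can be done once here and
  	 * amortized over the whole bulk, rather than per frame.
  	 */
  	for (i = 0; i < n; i++) {
  		if (example_xmit_one(dev, frames[i]) < 0)
  			break;
  		sent++;
  	}

  	/* Return how many frames were queued; what to do with the
  	 * remainder is left out of this sketch.
  	 */
  	return sent;
  }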
Patch 6/8 would have benefited from a review by the Intel folks, but the series has been pending for too long already, hence applied to bpf-next. Please address any follow-up reviews if/when they come. Thanks, Jesper.