From: Eric Dumazet <eduma...@google.com>
Date: Wed, 31 Oct 2018 08:39:11 -0700

> While BQL bulk dequeue works well for TSO packets, it is
> not very efficient as soon as GSO is involved.
>
> On a GSO only workload (UDP or TCP), this patch series
> can save about 8 % of cpu cycles on a 40Gbit mlx4 NIC,
> by keeping optimal batching, and avoiding expensive
> doorbells, qdisc requeues and reschedules.
>
> This patch series :
>
> - Add __netdev_tx_sent_queue() so that drivers
>   can implement efficient BQL and xmit_more support.
>
> - Implement a work around in dev_hard_start_xmit()
>   for drivers not using __netdev_tx_sent_queue()
>
> - changes mlx4 to use __netdev_tx_sent_queue()
>
> v2: Tariq and Willem feedback addressed.
>     added __netdev_tx_sent_queue() (Willem suggestion)

Series applied, but I wonder how many other commonly used drivers we
should update the same way mlx4 is here?
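
For anyone looking at converting another driver, here is a rough user-space
sketch of the batching idea described in the cover letter. It is not the
kernel code. It assumes the helper accounts the bytes and tells the driver
whether a doorbell is needed, so that a burst flagged with xmit_more costs a
single doorbell unless the queue is already stopped. The names sim_txq,
sim_tx_sent_queue() and sim_xmit_burst() are made up for illustration, and
the byte limit is only a stand-in for the real BQL state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a TX queue with a byte budget. */
struct sim_txq {
	unsigned int inflight;   /* bytes posted and not yet completed */
	unsigned int limit;      /* byte budget, standing in for the BQL limit */
	bool stopped;            /* queue stopped because the budget ran out */
	unsigned int doorbells;  /* number of simulated doorbell writes */
};

/*
 * Account 'bytes' as sent and report whether the caller should ring the
 * doorbell. Assumption: while more packets are coming (xmit_more), the
 * doorbell is only needed if the queue is already stopped; on the last
 * packet of a burst it is always needed.
 */
static bool sim_tx_sent_queue(struct sim_txq *q, unsigned int bytes,
			      bool xmit_more)
{
	assert(q != NULL);

	q->inflight += bytes;
	if (xmit_more)
		return q->stopped;

	/* End of burst: check the budget and stop the queue if exceeded. */
	if (q->inflight >= q->limit)
		q->stopped = true;
	return true;
}

/* Post a burst of packets and return the total doorbell count so far. */
static unsigned int sim_xmit_burst(struct sim_txq *q,
				   const unsigned int *lens, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		bool more = (i + 1 < n);  /* every packet but the last */

		if (sim_tx_sent_queue(q, lens[i], more))
			q->doorbells++;
	}
	return q->doorbells;
}
```

In this model a burst of four segments on a running queue rings the doorbell
once, while the same burst on an already stopped queue rings it for every
segment. A real driver conversion would have to follow what the mlx4 patch in
this series actually does.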