On Mon, 30 Mar 2026 14:01:54 -0700 Dipayaan Roy wrote:
> On some ARM64 platforms with 4K PAGE_SIZE, page_pool fragment
> allocation in the RX refill path can cause 15-20% throughput
> regression under high connection counts (>16 TCP streams).

Did you investigate what exactly makes such a difference?
As I said, I suspect there are some improvements we could
make in the page pool fragmentation logic that could yield
similar wins without bothering the user.

> Add an ethtool private flag "full-page-rx" that allows the user to
> force one RX buffer per page, bypassing the page_pool fragment path.
> This restores line-rate (180+ Gbps) performance on affected platforms.
> 
> Usage:
>   ethtool --set-priv-flags eth0 full-page-rx on
> 
> There is no behavioral change by default. The flag must be explicitly
> enabled by the user or udev rule.
> 
> The existing single-buffer-per-page logic for XDP and jumbo frames is
> consolidated into a new helper mana_use_single_rxbuf_per_page().

ethtool -g rx-buf-len could also fit the bill, but I guess this is more
of a hack / workaround than legit config, so no strong preference.

> -static void mana_get_strings(struct net_device *ndev, u32 stringset, u8 *data)
> +static void mana_get_strings_stats(struct mana_port_context *apc, u8 **data)
>  {
> -     struct mana_port_context *apc = netdev_priv(ndev);
>       unsigned int num_queues = apc->num_queues;
>       int i, j;
>  
> -     if (stringset != ETH_SS_STATS)
> -             return;
>       for (i = 0; i < ARRAY_SIZE(mana_eth_stats); i++)
> -             ethtool_puts(&data, mana_eth_stats[i].name);
> +             ethtool_puts(data, mana_eth_stats[i].name);
>  
>       for (i = 0; i < ARRAY_SIZE(mana_hc_stats); i++)
> -             ethtool_puts(&data, mana_hc_stats[i].name);
> +             ethtool_puts(data, mana_hc_stats[i].name);
>  
>       for (i = 0; i < ARRAY_SIZE(mana_phy_stats); i++)
> -             ethtool_puts(&data, mana_phy_stats[i].name);
> +             ethtool_puts(data, mana_phy_stats[i].name);
>  
>       for (i = 0; i < num_queues; i++) {
> -             ethtool_sprintf(&data, "rx_%d_packets", i);
> -             ethtool_sprintf(&data, "rx_%d_bytes", i);
> -             ethtool_sprintf(&data, "rx_%d_xdp_drop", i);
> -             ethtool_sprintf(&data, "rx_%d_xdp_tx", i);
> -             ethtool_sprintf(&data, "rx_%d_xdp_redirect", i);
> -             ethtool_sprintf(&data, "rx_%d_pkt_len0_err", i);
> +             ethtool_sprintf(data, "rx_%d_packets", i);

Please factor out the noisy, no-op prep work into a separate patch for
ease of review.
-- 
pw-bot: cr