On 1/26/21 10:39 AM, Jesper Dangaard Brouer wrote:
> The current layout of net_device is not optimal for cacheline usage.
> 
> The member adj_list.lower linked list is split between cacheline 2 and 3.
> The ifindex is placed together with stats (struct net_device_stats),
> although most modern drivers don't update this stats member.
> 
> The members netdev_ops, mtu and hard_header_len are placed on three
> different cachelines. These members are accessed for XDP redirect into
> devmap, which was noticeable with the perf tool. When not using the map
> redirect variant (like TC-BPF does), then ifindex is also used, which is
> placed on a separate fourth cacheline. These members are also accessed
> during forwarding with regular network stack. The members priv_flags and
> flags are on fast-path for network stack transmit path in __dev_queue_xmit
> (currently located together with mtu cacheline).
> 
> This patch creates a read mostly cacheline, with the purpose of keeping the
> above mentioned members on the same cacheline.
> 
> Some netdev_features_t members also become part of this cacheline, which is
> on purpose, as function netif_skb_features() is on fast-path via
> validate_xmit_skb().

A long overdue look at the organization of this struct. Do you have
performance numbers for the XDP case?
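
For anyone following along, the grouping idea in the description can be
sketched in plain userspace C. This is only an illustration, not the patch
itself: struct toy_netdev, its member types, the helper names and the 64-byte
cacheline size are all assumptions, and only the member names are borrowed
from the description above. The real layout of struct net_device is best
inspected with a tool such as pahole.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cacheline size; the real value depends on the architecture. */
#define CACHELINE_BYTES 64

/*
 * Hypothetical stand-in for struct net_device. Member names follow the
 * patch description; types and ordering are simplified for illustration.
 */
struct toy_netdev {
	char name[16];
	unsigned long state;

	/* Read-mostly fast-path members, starting on a fresh cacheline. */
	_Alignas(CACHELINE_BYTES) unsigned int flags;
	unsigned int priv_flags;
	const void *netdev_ops;
	int ifindex;
	unsigned short hard_header_len;
	unsigned int mtu;
	unsigned long long features;

	/* Frequently written data kept off the read-mostly cacheline. */
	_Alignas(CACHELINE_BYTES) unsigned long stats[8];
};

/* Index of the cacheline that a given byte offset falls into. */
size_t cacheline_of(size_t offset)
{
	return offset / CACHELINE_BYTES;
}

#define CL_OF(member) cacheline_of(offsetof(struct toy_netdev, member))

/* Assert that every fast-path member shares the cacheline of 'flags'. */
void check_read_mostly_group(void)
{
	size_t cl = CL_OF(flags);

	assert(CL_OF(priv_flags) == cl);
	assert(CL_OF(netdev_ops) == cl);
	assert(CL_OF(ifindex) == cl);
	assert(CL_OF(hard_header_len) == cl);
	assert(CL_OF(mtu) == cl);
	assert(CL_OF(features) == cl);
}
```

Calling check_read_mostly_group() from a small test program aborts if a later
reordering pushes one of these members onto a different cacheline, similar in
spirit to a build-time layout check.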
