On Wed, 17 Jun 2026 16:28:23 +0800
Junlong Wang <[email protected]> wrote:
> v6:
> - Remove unnecessary error checking code in submit_to_backend_simple() and
> pkt_padding(). Since as the max dl_net_hdr_len is always less than
> RTE_PKTMBUF_HEADROOM, rte_pktmbuf_prepend() cannot fail in the
> simple path (single-segment mbufs).
>
> v5:
> - Reorganize patch series, placing interrupt fix as the first patch
> and fix condition check to properly enable interrupts.
> - Fix zxdh_recv_single_pkts() not compacting rcv_pkts[] on failure,
> which could cause use-after-free and mbuf leak.
> - Fix tx_bunch() and tx1() missing store barrier before setting AVAIL flag,
> preventing data race on weakly-ordered architectures.
> - Fix submit_to_backend_simple() writing descriptors for packets that
> failed pkt_padding(), causing mbuf leak.
>
> v4:
> - fix some AI review issues.
> - fix queue enable intr bug.
>
> v3:
> - remove unnecessary NULL check in zxdh_init_queue.
> - Split Ring: Bit[31] is unused and reserved, zxdh_queue_notify(): removing
> the
> zxdh_pci_with_feature(hw, ZXDH_F_RING_PACKED) check;
> - remove unnecessary double-free in in zxdh_recv_single_pkts();
> - used rte_pktmbuf_mtod();
> - remove rxq_get_vq(q) macro, use q->vq and apply it consistently;
> - Refactoring scatter and mtu check logic in zxdh_dev_mtu_set();
> - set txdp->id = avail_idx + i in tx_bunch/tx1.
> - add comment documenting zxdh_xmit_enqueue_append() now sets dxp->cookie =
> NULL for
> the head slot and stores cookies per descriptor via dep[idx].cookie.
> - add one-line comment noting tx_bunch() is the simple path handles
> single-segment.
> - remove unnecessary Extra initialization and the uint32_t cast.
>
> v2:
> - zxdh_rxtx.c, pkt_padding(): modifyed the return value of pkt_padding();
> - zxdh_rxtx.c, zxdh_recv_single_pkts(): modifyed When zxdh_init_mbuf() fails
> the loop does "continue" and free mbufs;
> - zxdh_rxtx.c, refill_desc_unwrap(): Add rte_io_wmb() before writing flags
> in the refill_que_descs();
> - zxdh_queue.h, zxdh_queue_enable_intr(): Remove unnecessary function of
> zxdh_queue_enable_intr;
> - zxdh_ethdev.c, zxdh_init_queue(): changed the hdr_mz NULL check logic;
>
> - zxdh_rxtx.c, zxdh_xmit_pkts_simple()、zxdh_recv_single_pkts(): add
> stats.bytes count;
> - zxdh_rxtx.c, zxdh_init_mbuf():remove rte_pktmbuf_dump(stdout, rxm, 40);
> - zxdh_ethdev.c, zxdh_dev_free_mbufs(): using rte_pktmbuf_free() to free
> mbufs;
> - Splitting into separate patches, structure reorganization and sw_ring
> removal、
> RX recv optimize、Tx xmit optimize、Tx;
>
> v1:
> This patch optimizes the ZXDH PMD's receive and transmit path for better
> performance through several improvements:
>
> - Add simple TX/RX burst functions (zxdh_xmit_pkts_simple and
> zxdh_recv_single_pkts) for single-segment packet scenarios.
> - Remove RX software ring (sw_ring) to reduce memory allocation and
> copy.
> - Optimize descriptor management with prefetching and simplified
> cleanup.
> - Reorganize structure fields for better cache locality.
>
> These changes reduce CPU cycles and memory bandwidth consumption,
> resulting in improved packet processing throughput.
>
> Junlong Wang (4):
> net/zxdh: fix queue enable intr issues
> net/zxdh: optimize queue structure to improve performance
> net/zxdh: optimize Rx recv pkts performance
> net/zxdh: optimize Tx xmit pkts performance
>
> drivers/net/zxdh/zxdh_ethdev.c | 81 ++---
> drivers/net/zxdh/zxdh_ethdev_ops.c | 23 +-
> drivers/net/zxdh/zxdh_ethdev_ops.h | 4 +
> drivers/net/zxdh/zxdh_pci.c | 2 +-
> drivers/net/zxdh/zxdh_queue.c | 11 +-
> drivers/net/zxdh/zxdh_queue.h | 122 ++++---
> drivers/net/zxdh/zxdh_rxtx.c | 529 ++++++++++++++++++++++-------
> drivers/net/zxdh/zxdh_rxtx.h | 27 +-
> 8 files changed, 539 insertions(+), 260 deletions(-)
>
I thought this was better, but AI found one new bug.
Also, if the driver wants headroom in mbuf and headroom typically
is set from rte_pktmbuf_alloc, a good defensive in depth.
Something like:
static_assert(RTE_PKTMBUF_HEADROOM >= ZXDH_DL_NET_HDR_SIZE,
"RTE_PKTMBUF_HEADROOM too small for zxdh Tx downlink header");
You still need the check in transmit, but this would protect
against user misconfiguration.
Not sure why you needed to open code rte_pktmbuf_prepend() which
has the check that AI wants you to have
Series review: net/zxdh Rx/Tx optimization (v6)
Patches 1-3 are unchanged from v5 and remain good. The v5 issue in the
simple Tx path (leaked mbuf and unpublished descriptor on the
pkt_padding failure branch) is resolved: pkt_padding() no longer
returns a failure, so there is no skip/continue and no ring gap. The
way it was made infallible introduces a new problem.
[PATCH v6 4/4] net/zxdh: optimize Tx xmit pkts performance
Error: pkt_padding() writes the downlink header in place without
checking that the mbuf has hdr_len bytes of headroom, so a packet with
short headroom causes an out-of-bounds write and a data_off underflow.
hdr = rte_pktmbuf_mtod_offset(cookie, struct zxdh_net_hdr_dl *,
-hdr_len);
memcpy(hdr, net_hdr_dl, hdr_len);
/* Update mbuf to reflect the prepended header */
cookie->data_off -= hdr_len;
cookie->data_len += hdr_len;
cookie->pkt_len += hdr_len;
This open-codes rte_pktmbuf_prepend() but drops its headroom check. If
cookie->data_off < hdr_len, the mtod_offset pointer lands before
buf_addr and the rte_memcpy scribbles over the mbuf's private area or
the preceding memory; cookie->data_off (uint16_t) then underflows to a
large value. v4/v5 used rte_pktmbuf_prepend() here, which returns NULL
when headroom is short, so this is a regression: a checked-but-mishandled
case became an unchecked memory corruption.
Nothing upstream of the simple path guarantees the headroom. The simple
Tx burst is selected solely on RTE_ETH_TX_OFFLOAD_MULTI_SEGS being
disabled, which is independent of headroom, and tx_prepare()'s
dl_net_hdr_check() only validates offload capability, not headroom. The
driver itself shows the headroom cannot be assumed: the packed path
gates the in-place prepend on
txm->data_off >= ZXDH_DL_NET_HDR_SIZE
(zxdh_xmit_pkts_packed, the can_push test) and falls back to
zxdh_xmit_enqueue_append(), which copies the header into the reserved
txr region instead of prepending. The simple path has no equivalent
guard.
Restore a headroom check before the in-place prepend. For a
short-headroom packet, fall back to the append path that copies the
header into the reserved region (as the packed path does) rather than
writing before the buffer.