Hello Maxime,

On Tue, 16 Jun 2026 at 12:28, Maxime Leroy <[email protected]> wrote:
>
> Implement .rx_queue_intr_enable / .rx_queue_intr_disable so a worker
> can sleep on a queue's data-availability notification instead of
> busy-polling, through the generic rte_eth_dev_rx_intr_* API.
>
> A worker wakes on its software portal's DQRI, which fires when the
> portal's DQRR holds frames, so the Rx FQ must be scheduled to a channel
> that portal dequeues. The natural dpni_set_queue with a notification
> destination holds the global MC lock long enough to wedge the firmware
> and must target a disabled dpni. But the polling portal is only known
> once a worker affines, after dev_start, so the destination cannot be
> the worker's portal.
>
> Bind each Rx FQ to its own DPCON channel instead. The default Rx burst
> pulls frames from the FQ with a volatile dequeue and cannot be
> interrupt-driven; to wake on the DQRI the FQ must be pushed to the
> portal's DQRR. dev_start issues the DEST_DPCON set_queue statically on
> the still-disabled dpni with no knowledge of the polling lcore; a worker
> later subscribes its own ethrx portal to the channel and arms the DQRI
> in rx_queue_intr_enable (a one-shot per-portal MC op plus QBMan, never
> the wedging set_queue).
>
> This pushed/DQRR consumption is how the event PMD works, but the DPCON
> use differs. The event PMD uses one DPCON per worker, concentrates N
> FQs onto it, and lets the QBMan scheduler load-balance events across
> cores. Here affinity is static and there is no scheduling, so each FQ
> gets its own DPCON (one per FQ, more channels, drawn from the shared
> pool that the DPCON move to the fslmc bus now feeds), bound once at
> dev_start before the lcore is known. Frames are delivered by
> rte_eth_rx_burst (dpaa2_dev_rx_dqrr), not as events via
> rte_event_dequeue.
>
> rte_eth_dev_rx_intr_enable(q) subscribes the lcore portal to q's DPCON
> and arms the DQRI. rte_eth_dev_rx_intr_ctl_q(q) adds q's eventfd (the
> portal DQRI fd) to the thread epoll.
>
>       wire
>        |
>     [ DPMAC ]
>        |
>     [ DPNI ]                                     (1)
>        |
>     TC0:  FQ0   FQ1   FQ2   FQ3                  (2)
>            |     |     |     |                   (3)
>         [DPCON][DPCON][DPCON][DPCON]
>             \     |     |     /                  (4)
>           [ DPIO A ]      [ DPIO B ]             (5)
>              |               |
>             DQRR            DQRR                 (6)
>              |               |
>             DQRI            DQRI                 (7)
>              |               |
>           eventfd         eventfd                (8)
>              |               |
>         rte_epoll_wait  rte_epoll_wait           (9)
>              |               |
>         dpaa2_dev_rx_dqrr                        (10)
>
>   (1)  WRIOP picks a TC (QoS), then RSS-hashes within the TC to an FQ
>   (2)  FQ0..FQ3 are the rte_eth Rx queues
>   (3)  dpni_set_queue(DEST_DPCON): one DPCON per FQ
>   (4)  the lcore portal subscribes to its DPCONs (push_set)
>   (5)  one QBMan software portal per lcore
>   (6)  QMan pushes the FDs into the portal DQRR
>   (7)  DQRI is raised when the DQRR is non-empty
>   (8)  a portal's queues share one fd (its DQRI eventfd)
>   (9)  worker sleeps here when all its queues are idle
>   (10) dpaa2_dev_rx_dqrr drains the DQRR, demuxes FDs to FQs by fqd_ctx
>
> The DQRI and eventfd are portal-wide: a queue's eventfd is its portal's
> DQRI fd, and the inhibit bit is refcounted by armed queues so disabling
> one queue never masks a sibling. The static per-queue bind also lets a
> queue be re-homed to another lcore at runtime, the new worker
> reclaiming the channel, with no set_queue and no port stop.
>
> On single-core 64-byte forwarding this interrupt path runs at ~5.0 Mpps
> versus ~5.86 Mpps polling: per-frame DQRR demux and consume cost about
> 15 percent over the polling batch dequeue.
>
> Signed-off-by: Maxime Leroy <[email protected]>

I did not review in detail, but one aspect caught my eye:

[snip]

> diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
> index 803a8321e0..61e7c820de 100644
> --- a/drivers/net/dpaa2/dpaa2_ethdev.c
> +++ b/drivers/net/dpaa2/dpaa2_ethdev.c

[snip]

> @@ -845,6 +853,19 @@ dpaa2_eth_dev_configure(struct rte_eth_dev *dev)
>                 }
>         }
>
> +       if (dev->data->dev_conf.intr_conf.rxq) {
> +               if (!dev->intr_handle)
> +                       dev->intr_handle = rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);

Something looks wrong here.

I plan to move this allocation into the probe_device handler of the bus
(https://patchwork.dpdk.org/project/dpdk/patch/[email protected]/).
However, even without that change, the intr_handle should already be
allocated by the bus code (see the allocation in scan_one_fslmc_device).
A NULL pointer at configure time indicates a bug somewhere in the
lifetime of the device pointer.
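
As a rough illustration of the alternative, here is a minimal standalone
sketch of a configure step that treats a missing handle as an error
instead of allocating one. It is not DPDK code: the struct and function
names are hypothetical stand-ins, and the -EINVAL return value is an
assumption rather than what the driver should necessarily report.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an interrupt handle; not struct rte_intr_handle. */
struct intr_handle_model {
	int nb_efd;
};

/* Hypothetical stand-in for the ethdev; not struct rte_eth_dev. */
struct eth_dev_model {
	struct intr_handle_model *intr_handle; /* owned by the bus in this model */
	int rxq_intr_requested;                /* models intr_conf.rxq */
	int nb_rx_queues;
};

/*
 * Models the Rx interrupt part of a configure callback.
 * Returns 0 on success, or -EINVAL when the bus did not provide a handle.
 */
static int
configure_rxq_intr_model(struct eth_dev_model *dev)
{
	assert(dev != NULL);

	if (!dev->rxq_intr_requested)
		return 0;

	/* Report the missing handle rather than papering over it. */
	if (dev->intr_handle == NULL)
		return -EINVAL;

	dev->intr_handle->nb_efd = dev->nb_rx_queues;
	return 0;
}
```

With this shape, a NULL handle surfaces the lifetime problem at
configure time rather than hiding it behind a driver-private allocation.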


> +               if (!dev->intr_handle ||
> +                   rte_intr_vec_list_alloc(dev->intr_handle, "rxq_intr",
> +                               dev->data->nb_rx_queues) ||
> +                   rte_intr_nb_efd_set(dev->intr_handle, dev->data->nb_rx_queues) ||
> +                   rte_intr_type_set(dev->intr_handle, RTE_INTR_HANDLE_EXT)) {
> +                       DPAA2_PMD_ERR("Failed to set up rx-queue interrupts");
> +                       return -rte_errno;
> +               }
> +       }
> +
>         dpaa2_tm_init(dev);
>
>         return 0;


-- 
David Marchand
