On Thu,  7 May 2026 13:58:35 -0700
Stephen Hemminger <[email protected]> wrote:

> Bugzilla 1942: when a primary process exits cleanly, secondary
> processes other than testpmd do not get notified.  The notification
> mechanism added in 25.11 was placed in testpmd and used
> rte_mp_request_sync() with a testpmd-specific action name, so any
> non-testpmd secondary (dpdk-dumpcap, dpdk-pdump, dpdk-procinfo, or
> out-of-tree consumers) would log "Cannot find action: mp_testpmd"
> and the primary would block on the 5 second request timeout.
> 
> Putting application-specific IPC actions on a broadcast request path
> is the wrong layer.  Notification of primary exit is something every
> secondary needs and should come from EAL, not from each application.
> 
> Patch 1 reverts the testpmd-side mechanism (commit f96273c8e9d3).
> The secondary-side primary alive monitor (enable_primary_monitor)
> is preserved and continues to handle detection of primary exit via
> the existing alarm-based polling of rte_eal_primary_proc_alive().
> 
> Patch 2 adds a generic EAL-level notification.  On primary cleanup,
> rte_mp_channel_cleanup() broadcasts an MP_REQ_QUIT message to all
> known secondaries via rte_mp_sendmsg().  Secondaries register an
> internal action handler that tears down their own MP channel on
> receipt.  No new public API; no application changes required for
> any secondary, in-tree or out.
> 
> This is the minimum fix suitable for backport to 25.11 stable.
> It addresses the clean exit case.  The crash case (primary killed
> or signaled) continues to be handled by the existing
> rte_eal_primary_proc_alive() polling on the secondary side, which
> detects the primary's release of the config file lock.
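
For anyone reading along, the lock-based liveness check can be illustrated outside DPDK. This is only a rough sketch and not the EAL code: it uses plain fcntl() record locks on a temporary file, and the names lock_held_by_other() and lock_probe_demo() are made up for the example. A forked child stands in for the primary and is killed with SIGKILL; the parent stands in for the polling secondary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>	/* only for the usage shown after the block */
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if another process holds a lock on fd, 0 if not, -1 on error. */
static int lock_held_by_other(int fd)
{
	struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };

	if (fcntl(fd, F_GETLK, &fl) < 0)
		return -1;
	return fl.l_type != F_UNLCK;
}

/*
 * Hypothetical demo: a child ("primary") write-locks a temp file, the
 * parent ("secondary") probes the lock before and after killing the child.
 * Returns 0 on success, -1 on setup failure.
 */
int lock_probe_demo(int *alive_before, int *alive_after)
{
	char path[] = "/tmp/primary_lock_XXXXXX";
	int ready[2];
	char c;
	pid_t pid;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	if (pipe(ready) < 0) {
		close(fd);
		unlink(path);
		return -1;
	}

	pid = fork();
	if (pid == 0) {
		struct flock wr = { .l_type = F_WRLCK, .l_whence = SEEK_SET };

		close(ready[0]);
		if (fcntl(fd, F_SETLK, &wr) < 0)
			_exit(1);
		/* tell the parent the lock is in place */
		if (write(ready[1], "x", 1) != 1)
			_exit(1);
		for (;;)
			pause();	/* hold the lock until killed */
	}

	close(ready[1]);
	if (pid < 0 || read(ready[0], &c, 1) != 1) {
		if (pid > 0) {
			kill(pid, SIGKILL);
			waitpid(pid, NULL, 0);
		}
		close(ready[0]);
		close(fd);
		unlink(path);
		return -1;
	}

	*alive_before = lock_held_by_other(fd);	/* lock is held */

	kill(pid, SIGKILL);			/* forced exit of the "primary" */
	waitpid(pid, NULL, 0);

	*alive_after = lock_held_by_other(fd);	/* kernel dropped the lock */

	close(ready[0]);
	close(fd);
	unlink(path);
	return 0;
}
```

Calling lock_probe_demo(&before, &after) should leave before == 1 and after == 0, since the kernel releases a process's record locks when it dies, however it dies. The cost is that the secondary only notices on its next poll.
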
> 
> A more complete solution using a connected socket type (SOCK_SEQPACKET)
> is planned since that can handle both planned and forced exiting
> of the primary.
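
To make the connected-socket point concrete, here is a small standalone sketch, assuming nothing about how the planned EAL change will actually look. The helper peer_exit_seen() is hypothetical. A forked child stands in for the primary and holds one end of an AF_UNIX SOCK_SEQPACKET pair; the parent stands in for a secondary and simply waits in recv().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>	/* only for the usage shown after the block */
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not DPDK code.
 * forced != 0: the stand-in primary is killed with SIGKILL.
 * forced == 0: it exits cleanly.
 * Returns 1 if the secondary saw end-of-file, 0 if not, -1 on setup error.
 */
int peer_exit_seen(int forced)
{
	int sv[2];
	char buf[16];
	pid_t pid;
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		/* primary: keep only its own end */
		close(sv[0]);
		if (forced)
			for (;;)
				pause();	/* wait to be killed */
		_exit(0);	/* clean exit, kernel closes sv[1] */
	}

	/* secondary: drop the primary's end so end-of-file can be observed */
	close(sv[1]);
	if (forced)
		kill(pid, SIGKILL);

	/* blocks until the primary's end is closed, then returns 0 */
	n = recv(sv[0], buf, sizeof(buf), 0);

	close(sv[0]);
	waitpid(pid, NULL, 0);
	return n == 0;
}
```

Both peer_exit_seen(0) and peer_exit_seen(1) should return 1: the kernel closes the primary's end on any kind of exit, so the secondary is woken without polling and without the primary having to send anything.
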
> 
> Tested with testpmd (primary) and dpdk-dumpcap, dpdk-pdump,
> dpdk-procinfo (secondaries).
> 
> 
> Stephen Hemminger (2):
>   Revert "app/testpmd: stop forwarding in secondary process"
>   eal: notify secondary on primary exit
> 
>  app/test-pmd/testpmd.c           | 103 ++-----------------------------
>  lib/eal/common/eal_common_proc.c |  51 ++++++++++++++-
>  2 files changed, 53 insertions(+), 101 deletions(-)
> 

This is a real bug fix, even if the AI-written description is overly wordy.
What it does is move the notification from a testpmd -> testpmd only
mechanism to a more general primary -> secondary one.

Without it, the bug can be demonstrated rather trivially with the packet
capture tools (pdump or dumpcap).

Could this get reviewed?
