On Thu, 7 May 2026 13:58:35 -0700 Stephen Hemminger <[email protected]> wrote:
> Bugzilla 1942: when a primary process exits cleanly, secondary
> processes other than testpmd do not get notified. The notification
> mechanism added in 25.11 was placed in testpmd and used
> rte_mp_request_sync() with a testpmd-specific action name, so any
> non-testpmd secondary (dpdk-dumpcap, dpdk-pdump, dpdk-procinfo, or
> out-of-tree consumers) would log "Cannot find action: mp_testpmd"
> and the primary would block on the 5 second request timeout.
>
> Putting application-specific IPC actions on a broadcast request path
> is the wrong layer. Notification of primary exit is something every
> secondary needs and should come from EAL, not from each application.
>
> Patch 1 reverts the testpmd-side mechanism (commit f96273c8e9d3).
> The secondary-side primary alive monitor (enable_primary_monitor)
> is preserved and continues to handle detection of primary exit via
> the existing alarm-based polling of rte_eal_primary_proc_alive().
>
> Patch 2 adds a generic EAL-level notification. On primary cleanup,
> rte_mp_channel_cleanup() broadcasts an MP_REQ_QUIT message to all
> known secondaries via rte_mp_sendmsg(). Secondaries register an
> internal action handler that tears down their own MP channel on
> receipt. No new public API; no application changes required for
> any secondary, in-tree or out.
>
> This is the minimum fix suitable for backport to 25.11 stable.
> It addresses the clean exit case. The crash case (primary killed
> or signaled) continues to be handled by the existing
> rte_eal_primary_proc_alive() polling on the secondary side, which
> detects the primary's release of the config file lock.
>
> A more complete solution using a connected socket type (SOCK_SEQPACKET)
> is planned since that can handle both planned and forced exiting
> of the primary.
>
> Tested with testpmd (primary) and dpdk-dumpcap, dpdk-pdump,
> dpdk-procinfo (secondaries).
>
>
> Stephen Hemminger (2):
>   Revert "app/testpmd: stop forwarding in secondary process"
>   eal: notify secondary on primary exit
>
>  app/test-pmd/testpmd.c           | 103 ++-----------------------------
>  lib/eal/common/eal_common_proc.c |  51 ++++++++++++++-
>  2 files changed, 53 insertions(+), 101 deletions(-)
>

This is a real bug fix, even if the AI is overly wordy in describing it.
What it does is move the notification from a testpmd -> testpmd only
mechanism to a more general primary -> secondary one.

Without it, the bug is trivial to demonstrate with the packet capture
tools (pdump or dumpcap).

Could this get reviewed?
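
As a side note on the SOCK_SEQPACKET follow-up mentioned in the cover
letter, here is a minimal standalone sketch of why a connected socket
covers both exit cases. It is not taken from the patch and does not use
the EAL MP channel: the names run_demo(), wait_for_primary() and the
"quit" message text are hypothetical, and a forked child stands in for
the primary. It assumes Linux, where AF_UNIX supports SOCK_SEQPACKET.

```c
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical message text; this is NOT the MP_REQ_QUIT wire format. */
#define QUIT_MSG "quit"

enum primary_exit {
	PRIMARY_ERROR = -1,
	PRIMARY_GONE = 0,	/* peer vanished without saying goodbye */
	PRIMARY_QUIT = 1,	/* peer sent an explicit quit message */
};

/* Secondary side: block until the primary says quit or disappears. */
static enum primary_exit wait_for_primary(int fd)
{
	char buf[16];
	ssize_t n = recv(fd, buf, sizeof(buf), 0);

	if (n < 0)
		return PRIMARY_ERROR;
	if (n == 0)	/* connected socket: peer closed, clean or not */
		return PRIMARY_GONE;
	if ((size_t)n == sizeof(QUIT_MSG) &&
	    memcmp(buf, QUIT_MSG, sizeof(QUIT_MSG)) == 0)
		return PRIMARY_QUIT;
	return PRIMARY_ERROR;
}

/* Fork a stand-in "primary"; forced != 0 kills it with SIGKILL. */
enum primary_exit run_demo(int forced)
{
	enum primary_exit result;
	int sv[2];
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return PRIMARY_ERROR;

	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return PRIMARY_ERROR;
	}
	if (pid == 0) {
		/* Child plays the primary. */
		close(sv[0]);
		if (forced) {
			pause();	/* sit here until killed */
			_exit(1);
		}
		send(sv[1], QUIT_MSG, sizeof(QUIT_MSG), 0);
		_exit(0);
	}

	/* Parent plays the secondary. It must drop its copy of the
	 * primary's end, otherwise the socket never reports closure. */
	close(sv[1]);
	if (forced)
		kill(pid, SIGKILL);

	result = wait_for_primary(sv[0]);
	close(sv[0]);
	waitpid(pid, NULL, 0);
	return result;
}
```

Calling run_demo(0) should report PRIMARY_QUIT and run_demo(1) should
report PRIMARY_GONE. In both cases the secondary finds out from the
socket itself, with no polling of a lock file.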

