On 5/21/26 4:25 PM, Nicolas Dichtel wrote:
> On 21/05/2026 at 16:00, Jiri Benc wrote:
>> On Thu, 21 May 2026 14:36:12 +0200, Nicolas Dichtel wrote:
>>> I still don't think that this is the right "fix". The app is broken. Even after
>>> this patch, the bug could be easily triggered again by a third party.
>>> There is nothing wrong with assigning a self-nsid. It would be a lot more robust
>>> for the app to assign itself a self-nsid when it starts.
>>
>> On the other hand, does the patch break anything in practice (as
>> opposed to in theory)? It makes live of several apps simpler, which is
>> not a bad goal.
> I'm not against the patch, it just look like a workaround.
> I'm trying to understand how NETLINK_LISTEN_ALL_NSID is used (in fact, why it is
> used if the app doesn't "understand" NSIDs).

ovs-vswitchd works with NSIDs of remote ports.  So it does understand
them, it just doesn't expect the self-referential ones for the local
namespace.

The openvswitch module has minimal support for cross-namespace
operation.  Ports can be added to the openvswitch datapath and then
moved to a different namespace (it's a slightly weird use case, but
that's beside the point here).  ovs-vswitchd learns the new NSIDs of
those ports from the openvswitch module, and it can then perform a
limited set of cross-namespace operations on them and monitor their
status changes through notifications on an all-nsid socket.  It never
learns the NSID of the current local namespace, because all the local
ports can be accessed directly and the openvswitch module doesn't
report an NSID for them, as it's not needed for anything.

In the end, ovs-vswitchd knows all the remote NSIDs it needs to know and
can recognize them in notifications.  But it doesn't know the NSID of
its own local namespace, as the openvswitch module never reports that
for local ports and ovs-vswitchd doesn't explicitly check its own NSID.
So, local notifications with an NSID set are treated as notifications
from some remote namespace that we do not care about.

We will be putting changes into ovs-vswitchd to work around this issue,
simply because it will take time for the kernel patch to propagate to
distros.  But this code will not be useful for anything except working
around this one specific case, so it would be nice to get rid of it
eventually.  And it would be nice if future applications didn't need to
care about this behavior either.  Having the fix in stable will speed
up the process significantly.

HTH,
Best regards, Ilya Maximets.
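
P.S. For illustration only, the classification described above can be
sketched roughly as follows.  This is not ovs-vswitchd code: the
function and constant names are hypothetical, and the optional
self_nsid argument is just one assumed shape for a userspace
workaround (looking up our own NSID and treating it as local).

```python
from typing import Optional, Set

# Hypothetical labels, not taken from ovs-vswitchd.
LOCAL = "local"
REMOTE_KNOWN = "remote-known"
IGNORED = "ignored"


def classify_notification(nsid: Optional[int],
                          known_remote_nsids: Set[int],
                          self_nsid: Optional[int] = None) -> str:
    """Decide how a notification from an all-nsid socket is treated.

    nsid is None when the notification carries no NSID, i.e. it is a
    plain local one.  self_nsid models the assumed workaround: if the
    application knows the NSID assigned to its own namespace, it can
    recognize such notifications as local.
    """
    if nsid is None:
        return LOCAL
    if self_nsid is not None and nsid == self_nsid:
        # Assumed workaround path: self-referential NSID means local.
        return LOCAL
    if nsid in known_remote_nsids:
        return REMOTE_KNOWN
    # Unknown NSID: some remote namespace we do not care about.
    return IGNORED


# NSIDs learned for ports that were moved to other namespaces.
known = {1, 2}

# A third party assigned self-nsid 5 to the local namespace.  Without
# knowing that, local notifications tagged with 5 are dropped:
without_workaround = classify_notification(5, known)

# With the own NSID looked up, they are treated as local again:
with_workaround = classify_notification(5, known, self_nsid=5)
```

The first call shows the problem (the local notification is ignored),
the second shows what a workaround along these lines would change.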

