On Mon, May 18, 2026 at 8:04 AM Simon Horman <[email protected]> wrote:
>
> From: 'Simon Horman' <[email protected]>
>
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
> netlink, audit: prevent false ENOBUFS on timeout expiry
>
> This commit modifies netlink_unicast() to return -EAGAIN when a blocking
> sender's timeout expires. This bypasses the netlink_overrun() call to
> allow the audit subsystem to gracefully enqueue the pending event into its
> internal backlog without falsely returning ENOBUFS to user-space.
>
> > diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> > --- a/net/netlink/af_netlink.c
> > +++ b/net/netlink/af_netlink.c
> > @@ -1351,8 +1351,18 @@ int netlink_unicast(struct sock *ssk, struct sk_buff *skb,
> >       }
> >
> >       err = netlink_attachskb(sk, skb, &timeo, ssk);
> > -     if (err == 1)
> > +     if (err == 1) {
> > +             /* timeo may have been zeroed by schedule_timeout inside
> > +              * netlink_attachskb. If the caller is a timed-blocking sender
> > +              * (not genuinely nonblocking), don't re-enter with timeo=0 as
> > +              * that would misfire netlink_overrun on the next iteration.
> > +              */
> > +             if (timeo == 0 && !nonblock) {
> > +                     kfree_skb(skb);
> > +                     return -EAGAIN;
>
> Could this regression cause silent packet drops in non-audit subsystems?
>
> While this change prevents false ENOBUFS for the audit subsystem, other
> subsystems like InfiniBand's IWPM use rdma_nl_unicast_wait(), which sets
> nonblock=0 and operates on a kernel socket with a timeout.
>
> If the timeout expires because the receiving user-space daemon is
> bottlenecked, this new path will free the skb and return -EAGAIN without
> ever invoking netlink_overrun().
>
> Without netlink_overrun(), the receiver's socket error is never set to
> ENOBUFS. Would this prevent user-space from being notified of the dropped
> message, leading to permanent state desynchronization between the kernel
> and user-space?
>
> > +             }
> >               goto retry;
> > +     }
> >       if (err)
> >               return err;
>

Hi Simon,

Thanks for reviewing this patch!

You are right that the timeo == 0 && !nonblock heuristic in v2 is too broad:
it matches any blocking sender whose finite sk_sndtimeo has expired, not just
audit. RDMA/IWPM with MAX_SCHEDULE_TIMEOUT would never reach this path in
practice, but the heuristic should not depend on that.

I've submitted v3 [1], which adds an explicit NETLINK_UNICAST_TIMED constant
(value 2) for the nonblock argument. Callers must opt into this contract
explicitly, leaving IWPM and all other subsystems untouched:

    if (timeo == 0 && nonblock == NETLINK_UNICAST_TIMED)

This ensures zero risk of silent drops or state desynchronization in other
subsystems. Does this address your concern?
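
To make the difference between the two checks concrete, here is a minimal
userspace sketch of only the decision taken after netlink_attachskb()
returns 1. It is not the kernel code: the function names
v2_after_attach_wait() and v3_after_attach_wait() are hypothetical, the
return convention (0 meaning "goto retry") is invented for the model, and
NETLINK_UNICAST_TIMED is redefined locally with the value 2 mentioned above.

```c
#include <errno.h>

/* Local stand-in for the v3 constant; the value 2 is taken from this mail. */
#define NETLINK_UNICAST_TIMED 2

/*
 * Model of the v2 check. Returns 0 for "goto retry" and -EAGAIN for
 * "free the skb and return". Any blocking sender (nonblock == 0) whose
 * timeout has reached zero takes the new path.
 */
int v2_after_attach_wait(long timeo, int nonblock)
{
	if (timeo == 0 && !nonblock)
		return -EAGAIN;
	return 0;
}

/*
 * Model of the v3 check. Only a caller that passed NETLINK_UNICAST_TIMED
 * takes the new path; every other nonblock value falls through to retry,
 * as it did before the patch.
 */
int v3_after_attach_wait(long timeo, int nonblock)
{
	if (timeo == 0 && nonblock == NETLINK_UNICAST_TIMED)
		return -EAGAIN;
	return 0;
}
```

In this model a blocking sender with an expired timeout (timeo == 0,
nonblock == 0) gets -EAGAIN under the v2 check but falls through to retry
under the v3 check, which is the case raised in the review.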

[1] https://lore.kernel.org/audit/[email protected]/T/#u

Best regards,
Ricardo

