On Tue, 2026-04-14 at 19:23 +0800, KaFai Wan wrote:

The AI is right, and I'm late to this issue. Please ignore this patch. Sorry for the noise.

> A BPF_SOCK_OPS program can enable
> BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG and then call
> bpf_setsockopt(TCP_NODELAY) from BPF_SOCK_OPS_HDR_OPT_LEN_CB.
> 
> That reaches __tcp_sock_set_nodelay(), which may call
> tcp_push_pending_frames(). The transmit path then computes TCP
> options again, re-enters bpf_skops_hdr_opt_len(), and invokes the
> same BPF callback recursively. This can loop until the kernel
> stack overflows.
> 
> TCP_NODELAY is not safe from the header option callback context.
> Reject it with -EOPNOTSUPP when TCP header option callbacks are
> enabled on the socket, so the callback cannot recurse back into
> tcp_push_pending_frames() through do_tcp_setsockopt().
> 
> Reported-by: Quan Sun <[email protected]>
> Reported-by: Yinhao Hu <[email protected]>
> Reported-by: Kaiyan Mei <[email protected]>
> Closes: https://lore.kernel.org/bpf/[email protected]/
> Fixes: 7e41df5dbba2 ("bpf: Add a few optnames to bpf_setsockopt")
> Signed-off-by: KaFai Wan <[email protected]>
> ---
>  net/ipv4/tcp.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 202a4e57a218..7ac4c98be19d 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -4004,7 +4004,10 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
>  
>       switch (optname) {
>       case TCP_NODELAY:
> -             __tcp_sock_set_nodelay(sk, val);
> +             if (val && BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG))
> +                     err = -EOPNOTSUPP;
> +             else
> +                     __tcp_sock_set_nodelay(sk, val);
>               break;
>  
>       case TCP_THIN_LINEAR_TIMEOUTS:
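
For anyone reading the archive later, the re-entrancy described in the commit message can be sketched as a small userspace model. This is not kernel code: every identifier below (model_sock, model_hdr_opt_len, MODEL_MAX_DEPTH, and so on) is hypothetical and only mirrors the role of the kernel function named in its comment. The depth cap stands in for a finite kernel stack, and the guard flag toggles a check shaped like the one in the hunk above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a finite kernel stack. */
#define MODEL_MAX_DEPTH 64

/* Hypothetical model of the socket state relevant to the report. */
struct model_sock {
	bool hdr_opt_cb_enabled; /* models BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG */
	bool guard;              /* true: apply a check shaped like the patch */
	bool nodelay;            /* models the TCP_NODELAY state */
	int depth;               /* current nesting of the callback */
	int max_depth;           /* deepest nesting observed */
	bool overflowed;         /* set when the depth cap is reached */
};

static int model_setsockopt_nodelay(struct model_sock *sk, int val);

/* Models bpf_skops_hdr_opt_len() running the BPF callback. */
static void model_hdr_opt_len(struct model_sock *sk)
{
	if (!sk->hdr_opt_cb_enabled)
		return;
	if (sk->depth >= MODEL_MAX_DEPTH) {
		/* In the kernel there is no such cap; the stack overflows. */
		sk->overflowed = true;
		return;
	}
	sk->depth++;
	if (sk->depth > sk->max_depth)
		sk->max_depth = sk->depth;
	/* The modelled BPF program: set TCP_NODELAY from the callback. */
	(void)model_setsockopt_nodelay(sk, 1);
	sk->depth--;
}

/* Models tcp_push_pending_frames(): transmit recomputes TCP options. */
static void model_push_pending_frames(struct model_sock *sk)
{
	model_hdr_opt_len(sk);
}

/* Models the TCP_NODELAY case of do_tcp_setsockopt(). */
static int model_setsockopt_nodelay(struct model_sock *sk, int val)
{
	if (sk->guard && val && sk->hdr_opt_cb_enabled)
		return -EOPNOTSUPP;
	sk->nodelay = val != 0;
	if (val)
		model_push_pending_frames(sk);
	return 0;
}
```

With guard unset, a single model_push_pending_frames() call nests until the cap is hit; with guard set, the callback runs once and the setsockopt call returns -EOPNOTSUPP. Clearing the option (val == 0) is still accepted in both cases, matching the "val &&" condition in the hunk.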

-- 
Thanks,
KaFai
