On Wed, Apr 01, 2026 at 09:46:37PM -0700, Laurence Rowe wrote:
> A common pattern in epoll network servers is to eagerly accept all
> pending connections from the non-blocking listening socket after
> epoll_wait indicates the socket is ready by calling accept in a loop
> until EAGAIN is returned indicating that the backlog is empty.
> 
> Scheduling a timeout for a non-blocking accept with an empty backlog
> meant AF_VSOCK sockets used by epoll network servers incurred hundreds
> of microseconds of additional latency per accept loop compared to
> AF_INET or AF_UNIX sockets.
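
For readers less familiar with the pattern described above, here is a minimal userspace sketch of the eager accept loop. It is not taken from the patch or from the reproduction tools; `drain_accept_backlog` is a hypothetical helper name, and closing each accepted fd stands in for real connection handling. It assumes the listening socket is already non-blocking and that epoll_wait has just reported it readable.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Accept every pending connection on a non-blocking listening socket.
 * Returns the number of connections accepted, or -1 on an unexpected
 * error. Hypothetical helper for illustration only.
 */
static int drain_accept_backlog(int listen_fd)
{
	int accepted = 0;

	for (;;) {
		int fd = accept(listen_fd, NULL, NULL);

		if (fd >= 0) {
			/* Placeholder: a real server would register fd with epoll. */
			close(fd);
			accepted++;
			continue;
		}
		if (errno == EINTR)
			continue;		/* interrupted, retry */
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return accepted;	/* backlog is empty, loop is done */
		return -1;			/* unexpected error */
	}
}
```

Every pass through this loop ends with one accept call against an empty backlog, which is the call that incurred the extra latency on AF_VSOCK according to the description above.
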
> 
> Signed-off-by: Laurence Rowe <[email protected]>
> ---
> 
> This fixes the observed issue for me:
> 
> 1. With loopback vsock on the host running Linux v6.19.10 built with
> config-6.17.0-19-generic from Ubuntu 24.04 and make olddefconfig.
> 
> 2. With Firecracker guests with current torvalds/master, v6.19.10, and
> amazonlinux/microvm-kernel-6.1.166-24.303.amzn2023 used in Firecracker
> CI and examples. (Firecracker guest vsocks are unix sockets on the host
> side so this fix works there with just a fixed guest kernel.)
> 
> I struggled to build a generic 6.1.166 kernel that worked as a
> Firecracker guest but the patch applies (conflict due to change of
> `flags` to `arg->flags` in surrounding context) so I believe it should
> work for generic v6.1.166 kernel.
> 
> Alternatively a minimal version of this fix is to just wrap the
> `schedule_timeout` in an `if (timeout != 0)` but that leaves an
> unnecessary additional `lock_sock` call.
> 
> There are ftrace traces and reproduction tools at:
> https://github.com/lrowe/linux-vsock-accept-timeout-investigation
> ---
>  net/vmw_vsock/af_vsock.c | 16 +++++++---------
>  1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index 2f7d94d682..483889b6d8 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -1850,11 +1850,11 @@ static int vsock_accept(struct socket *sock, struct socket *newsock,
>        * created upon connection establishment.
>        */
>       timeout = sock_rcvtimeo(listener, arg->flags & O_NONBLOCK);
> -     prepare_to_wait(sk_sleep(listener), &wait, TASK_INTERRUPTIBLE);
>  
>       while ((connected = vsock_dequeue_accept(listener)) == NULL &&
> -            listener->sk_err == 0) {
> +            listener->sk_err == 0 && timeout != 0) {
>               release_sock(listener);
> +             prepare_to_wait(sk_sleep(listener), &wait, TASK_INTERRUPTIBLE);
>               timeout = schedule_timeout(timeout);
>               finish_wait(sk_sleep(listener), &wait);
>               lock_sock(listener);
> @@ -1862,17 +1862,15 @@ static int vsock_accept(struct socket *sock, struct socket *newsock,
>               if (signal_pending(current)) {
>                       err = sock_intr_errno(timeout);
>                       goto out;
> -             } else if (timeout == 0) {
> -                     err = -EAGAIN;
> -                     goto out;
>               }
> -
> -             prepare_to_wait(sk_sleep(listener), &wait, TASK_INTERRUPTIBLE);
>       }
> -     finish_wait(sk_sleep(listener), &wait);
>  
> -     if (listener->sk_err)
> +     if (listener->sk_err) {
>               err = -listener->sk_err;
> +     } else if (timeout == 0 && connected == NULL) {
> +             err = -EAGAIN;
> +             goto out;
> +     }

I wonder if this goto can be omitted since the following 'if
(connected)' guards the connected != NULL case? I don't have a strong
opinion, just noticed it would keep if-else symmetrical.
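
To make the suggestion concrete, here is a small userspace model of the tail of the function, not kernel code. It assumes that only the 'if (connected)' block sits between this branch and the `out:` label; the function names and the `handled` flag are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Model of the tail as posted: the EAGAIN branch jumps to out. */
static int tail_as_posted(int sk_err, long timeout, bool connected,
			  bool *handled)
{
	int err = 0;

	if (sk_err) {
		err = -sk_err;
	} else if (timeout == 0 && !connected) {
		err = -EAGAIN;
		goto out;
	}

	if (connected)
		*handled = true;	/* stands in for the graft/reject block */
out:
	return err;
}

/* Model of the suggested form: same branches, no goto. */
static int tail_without_goto(int sk_err, long timeout, bool connected,
			     bool *handled)
{
	int err = 0;

	if (sk_err)
		err = -sk_err;
	else if (timeout == 0 && !connected)
		err = -EAGAIN;

	if (connected)
		*handled = true;	/* skipped when connected is NULL/false */

	return err;
}
```

Under that assumption both variants return the same value and run the connected block in the same cases, since the EAGAIN branch is only taken when there is no connected socket.
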

All-in-all, LGTM.

Reviewed-by: Bobby Eshleman <[email protected]>
