> -----Original Message-----
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Thursday, December 24, 2020 1:56 PM
> To: wangyunjian <wangyunj...@huawei.com>
> Cc: netdev@vger.kernel.org; m...@redhat.com;
> willemdebruijn.ker...@gmail.com; virtualizat...@lists.linux-foundation.org;
> Lilijun (Jerry) <jerry.lili...@huawei.com>; chenchanghu
> <chenchan...@huawei.com>; xudingke <xudin...@huawei.com>; huangbin (J)
> <brian.huang...@huawei.com>
> Subject: Re: [PATCH net v4 2/2] vhost_net: fix tx queue stuck when sendmsg
> fails
> 
> 
> On 2020/12/24 下午12:37, wangyunjian wrote:
> >> -----Original Message-----
> >> From: Jason Wang [mailto:jasow...@redhat.com]
> >> Sent: Thursday, December 24, 2020 11:10 AM
> >> To: wangyunjian <wangyunj...@huawei.com>; netdev@vger.kernel.org;
> >> m...@redhat.com; willemdebruijn.ker...@gmail.com
> >> Cc: virtualizat...@lists.linux-foundation.org; Lilijun (Jerry)
> >> <jerry.lili...@huawei.com>; chenchanghu <chenchan...@huawei.com>;
> >> xudingke <xudin...@huawei.com>; huangbin (J)
> >> <brian.huang...@huawei.com>
> >> Subject: Re: [PATCH net v4 2/2] vhost_net: fix tx queue stuck when
> >> sendmsg fails
> >>
> >>
> >> On 2020/12/24 上午10:25, wangyunjian wrote:
> >>> From: Yunjian Wang <wangyunj...@huawei.com>
> >>>
> >>> Currently the driver doesn't drop a packet which can't be sent by
> >>> tun (e.g bad packet). In this case, the driver will always process
> >>> the same packet lead to the tx queue stuck.
> >>>
> >>> To fix this issue:
> >>> 1. in the case of persistent failure (e.g bad packet), the driver
> >>> can skip this descriptor by ignoring the error.
> >>> 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the
> >>> driver schedules the worker to try again.
> >>
> >> I might be wrong but looking at alloc_skb_with_frags(), it returns
> >> -ENOBUFS
> >> actually:
> >>
> >>       *errcode = -ENOBUFS;
> >>       skb = alloc_skb(header_len, gfp_mask);
> >>       if (!skb)
> >>           return NULL;
> > Yes, but the sock_alloc_send_pskb() returns - EAGAIN when no sndbuf space.
> > So there is need to check return value which is -EAGAIN or -ENOMEM or -
> EAGAIN?
> >
> > struct sk_buff *sock_alloc_send_pskb() { ...
> >     for (;;) {
> > ...
> >             sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk);
> >             set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> >             err = -EAGAIN;
> >             if (!timeo)
> >                     goto failure;
> > ...
> >     }
> >     skb = alloc_skb_with_frags(header_len, data_len, max_page_order,
> >                                errcode, sk->sk_allocation);
> >     if (skb)
> >             skb_set_owner_w(skb, sk);
> >     return skb;
> > ...
> >     *errcode = err;
> >     return NULL;
> > }
> 
> 
> -EAGAIN and -ENOBFS are fine. But I don't see how -ENOMEM can be returned.

The tun_build_skb() and tun_napi_alloc_frags() return -ENOMEM when memory
allocation fails.

Thanks

> 
> Thanks
> 
> 
> >> Thanks
> >>
> >>
> >>> Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server")
> >>> Signed-off-by: Yunjian Wang <wangyunj...@huawei.com>
> >>> ---
> >>>    drivers/vhost/net.c | 16 ++++++++--------
> >>>    1 file changed, 8 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index
> >>> c8784dfafdd7..e76245daa7f6 100644
> >>> --- a/drivers/vhost/net.c
> >>> +++ b/drivers/vhost/net.c
> >>> @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net
> >>> *net,
> >> struct socket *sock)
> >>>                                   msg.msg_flags &= ~MSG_MORE;
> >>>                   }
> >>>
> >>> -         /* TODO: Check specific error and bomb out unless ENOBUFS?
> */
> >>>                   err = sock->ops->sendmsg(sock, &msg, len);
> >>> -         if (unlikely(err < 0)) {
> >>> +         if (unlikely(err == -EAGAIN || err == -ENOMEM)) {
> >>>                           vhost_discard_vq_desc(vq, 1);
> >>>                           vhost_net_enable_vq(net, vq);
> >>>                           break;
> >>>                   }
> >>> -         if (err != len)
> >>> +         if (err >= 0 && err != len)
> >>>                           pr_debug("Truncated TX packet: len %d != %zd\n",
> >>>                                    err, len);
> >>>    done:
> >>> @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net
> >> *net, struct socket *sock)
> >>>                           msg.msg_flags &= ~MSG_MORE;
> >>>                   }
> >>>
> >>> -         /* TODO: Check specific error and bomb out unless ENOBUFS?
> */
> >>>                   err = sock->ops->sendmsg(sock, &msg, len);
> >>>                   if (unlikely(err < 0)) {
> >>>                           if (zcopy_used) {
> >>> @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct
> >>> vhost_net
> >> *net, struct socket *sock)
> >>>                                   nvq->upend_idx = 
> >>> ((unsigned)nvq->upend_idx - 1)
> >>>                                           % UIO_MAXIOV;
> >>>                           }
> >>> -                 vhost_discard_vq_desc(vq, 1);
> >>> -                 vhost_net_enable_vq(net, vq);
> >>> -                 break;
> >>> +                 if (err == -EAGAIN || err == -ENOMEM) {
> >>> +                         vhost_discard_vq_desc(vq, 1);
> >>> +                         vhost_net_enable_vq(net, vq);
> >>> +                         break;
> >>> +                 }
> >>>                   }
> >>> -         if (err != len)
> >>> +         if (err >= 0 && err != len)
> >>>                           pr_debug("Truncated TX packet: "
> >>>                                    " len %d != %zd\n", err, len);
> >>>                   if (!zcopy_used)

Reply via email to