On Mon, 22 Jun 2020 15:45:46 +0300 Denis Kirjanov <k...@linux-powerpc.org> wrote:
> On 6/22/20, Jesper Dangaard Brouer <bro...@redhat.com> wrote:
> >
> > On Mon, 22 Jun 2020 12:21:11 +0300 Denis Kirjanov <k...@linux-powerpc.org>
> > wrote:
> >
> >> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> >> index 482c6c8..1b9f49e 100644
> >> --- a/drivers/net/xen-netfront.c
> >> +++ b/drivers/net/xen-netfront.c
> > [...]
> >> @@ -560,6 +572,65 @@ static u16 xennet_select_queue(struct net_device
> >> *dev, struct sk_buff *skb,
> >>  	return queue_idx;
> >>  }
> >>
> >> +static int xennet_xdp_xmit_one(struct net_device *dev, struct xdp_frame
> >> *xdpf)
> >> +{
> >> +	struct netfront_info *np = netdev_priv(dev);
> >> +	struct netfront_stats *tx_stats = this_cpu_ptr(np->tx_stats);
> >> +	unsigned int num_queues = dev->real_num_tx_queues;
> >> +	struct netfront_queue *queue = NULL;
> >> +	struct xen_netif_tx_request *tx;
> >> +	unsigned long flags;
> >> +	int notify;
> >> +
> >> +	queue = &np->queues[smp_processor_id() % num_queues];
> >> +
> >> +	spin_lock_irqsave(&queue->tx_lock, flags);
> >
> > Why are you taking a lock per packet (xdp_frame)?
> Hi Jesper,
>
> We have to protect shared ring indices.

Sure, I understand we need to protect the rings. What I'm asking is why
you are doing this per packet, and not once for the entire bulk of
packets? (Notice how xennet_xdp_xmit gets a bulk of packets.)

> >
> >> +
> >> +	tx = xennet_make_first_txreq(queue, NULL,
> >> +				     virt_to_page(xdpf->data),
> >> +				     offset_in_page(xdpf->data),
> >> +				     xdpf->len);
> >> +
> >> +	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&queue->tx, notify);
> >> +	if (notify)
> >> +		notify_remote_via_irq(queue->tx_irq);
> >> +
> >> +	u64_stats_update_begin(&tx_stats->syncp);
> >> +	tx_stats->bytes += xdpf->len;
> >> +	tx_stats->packets++;
> >> +	u64_stats_update_end(&tx_stats->syncp);
> >> +
> >> +	xennet_tx_buf_gc(queue);
> >> +
> >> +	spin_unlock_irqrestore(&queue->tx_lock, flags);
> >
> > Is the irqsave/irqrestore variant really needed here?
>
> netpoll also invokes the tx completion handler.

I forgot about netpoll. The netpoll code cannot call this code path
(xennet_xdp_xmit / xennet_xdp_xmit_one), right?

Are the per-CPU ring queues shared with the normal network stack, which
can be called from the netpoll code path?

	queue = &np->queues[smp_processor_id() % num_queues];

> >
> >> +	return 0;
> >> +}
> >> +
> >> +static int xennet_xdp_xmit(struct net_device *dev, int n,
> >> +			   struct xdp_frame **frames, u32 flags)
> >> +{
> >> +	int drops = 0;
> >> +	int i, err;
> >> +
> >> +	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
> >> +		return -EINVAL;
> >> +
> >> +	for (i = 0; i < n; i++) {
> >> +		struct xdp_frame *xdpf = frames[i];
> >> +
> >> +		if (!xdpf)
> >> +			continue;
> >> +		err = xennet_xdp_xmit_one(dev, xdpf);
> >> +		if (err) {
> >> +			xdp_return_frame_rx_napi(xdpf);
> >> +			drops++;
> >> +		}
> >> +	}
> >> +
> >> +	return n - drops;
> >> +}

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
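
To make the per-packet versus per-bulk locking question concrete, here is
a rough userspace model. It is only a sketch under stated assumptions, not
the driver code: a pthread mutex stands in for queue->tx_lock, a plain
array stands in for the shared TX ring, and every name in it (struct
tx_queue, xmit_per_packet, xmit_bulk, the counters) is hypothetical. It
says nothing about whether the irqsave variant is required.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these names exist in xen-netfront. */
#define RING_SIZE 256

struct frame {
	size_t len;
};

struct tx_queue {
	pthread_mutex_t lock;             /* plays the role of queue->tx_lock */
	unsigned int prod;                /* producer index of the modelled ring */
	const struct frame *ring[RING_SIZE];
	unsigned long lock_acquisitions;  /* how many times the lock was taken */
	unsigned long notifications;      /* how many times the peer was "notified" */
};

static void tx_queue_init(struct tx_queue *q)
{
	memset(q, 0, sizeof(*q));
	pthread_mutex_init(&q->lock, NULL);
}

/* Caller must hold q->lock. Returns 0 on success, -1 if the ring is full. */
static int queue_one_locked(struct tx_queue *q, const struct frame *f)
{
	if (q->prod >= RING_SIZE)
		return -1;
	q->ring[q->prod++] = f;
	return 0;
}

/* Shape of the quoted patch: lock, push and notify once per frame. */
static int xmit_per_packet(struct tx_queue *q, const struct frame **frames, int n)
{
	int i, sent = 0;

	for (i = 0; i < n; i++) {
		if (!frames[i])
			continue;
		pthread_mutex_lock(&q->lock);
		q->lock_acquisitions++;
		if (queue_one_locked(q, frames[i]) == 0) {
			sent++;
			q->notifications++;
		}
		pthread_mutex_unlock(&q->lock);
	}
	return sent;
}

/* Alternative being asked about: one lock and one notify for the whole bulk. */
static int xmit_bulk(struct tx_queue *q, const struct frame **frames, int n)
{
	int i, sent = 0;

	pthread_mutex_lock(&q->lock);
	q->lock_acquisitions++;
	for (i = 0; i < n; i++) {
		if (!frames[i])
			continue;
		if (queue_one_locked(q, frames[i]) == 0)
			sent++;
	}
	if (sent)
		q->notifications++;
	pthread_mutex_unlock(&q->lock);
	return sent;
}
```

In this model, a bulk of four frames costs four lock round trips in
xmit_per_packet() and one in xmit_bulk(), with the ring contents ending up
the same. Unlike the quoted xennet_xdp_xmit(), both helpers return the
number of frames queued rather than n - drops.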