Hi Jakub,

Thanks for reviewing!
On 2018/07/24 9:23, Jakub Kicinski wrote:
> On Mon, 23 Jul 2018 00:13:02 +0900, Toshiaki Makita wrote:
>> From: Toshiaki Makita <makita.toshi...@lab.ntt.co.jp>
>>
>> This is the basic implementation of veth driver XDP.
>>
>> Incoming packets are sent from the peer veth device in the form of skb,
>> so this is generally doing the same thing as generic XDP.
>>
>> This itself is not so useful, but it is a starting point to implement
>> other useful veth XDP features like TX and REDIRECT.
>>
>> This introduces NAPI when XDP is enabled, because XDP now heavily
>> relies on NAPI context. Use ptr_ring to emulate the NIC ring. The Tx
>> function enqueues packets to the ring and the peer's NAPI handler
>> drains the ring.
>>
>> Currently only one ring is allocated for each veth device, so it does
>> not scale in a multiqueue environment. This can be resolved later by
>> allocating rings on a per-queue basis.
>>
>> Note that netif_rx is used instead of NAPI when XDP is not loaded, so
>> this does not change the default behaviour.
>>
>> v3:
>> - Fix race on closing the device.
>> - Add extack messages in ndo_bpf.
>>
>> v2:
>> - Squashed with the patch adding NAPI.
>> - Implement adjust_tail.
>> - Don't acquire consumer lock because it is guarded by NAPI.
>> - Make poll_controller a noop since it is unnecessary.
>> - Register rxq_info on enabling XDP rather than on opening the device.
>>
>> Signed-off-by: Toshiaki Makita <makita.toshi...@lab.ntt.co.jp>
>
>> +static struct sk_buff *veth_xdp_rcv_skb(struct veth_priv *priv,
>> +					struct sk_buff *skb)
>> +{
>> +	u32 pktlen, headroom, act, metalen;
>> +	void *orig_data, *orig_data_end;
>> +	int size, mac_len, delta, off;
>> +	struct bpf_prog *xdp_prog;
>> +	struct xdp_buff xdp;
>> +
>> +	rcu_read_lock();
>> +	xdp_prog = rcu_dereference(priv->xdp_prog);
>> +	if (unlikely(!xdp_prog)) {
>> +		rcu_read_unlock();
>> +		goto out;
>> +	}
>> +
>> +	mac_len = skb->data - skb_mac_header(skb);
>> +	pktlen = skb->len + mac_len;
>> +	size = SKB_DATA_ALIGN(VETH_XDP_HEADROOM + pktlen) +
>> +	       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>> +	if (size > PAGE_SIZE)
>> +		goto drop;
>> +
>> +	headroom = skb_headroom(skb) - mac_len;
>> +	if (skb_shared(skb) || skb_head_is_locked(skb) ||
>> +	    skb_is_nonlinear(skb) || headroom < XDP_PACKET_HEADROOM) {
>> +		struct sk_buff *nskb;
>> +		void *head, *start;
>> +		struct page *page;
>> +		int head_off;
>> +
>> +		page = alloc_page(GFP_ATOMIC);
>> +		if (!page)
>> +			goto drop;
>> +
>> +		head = page_address(page);
>> +		start = head + VETH_XDP_HEADROOM;
>> +		if (skb_copy_bits(skb, -mac_len, start, pktlen)) {
>> +			page_frag_free(head);
>> +			goto drop;
>> +		}
>> +
>> +		nskb = veth_build_skb(head,
>> +				      VETH_XDP_HEADROOM + mac_len, skb->len,
>> +				      PAGE_SIZE);
>> +		if (!nskb) {
>> +			page_frag_free(head);
>> +			goto drop;
>> +		}
>
>> +static int veth_enable_xdp(struct net_device *dev)
>> +{
>> +	struct veth_priv *priv = netdev_priv(dev);
>> +	int err;
>> +
>> +	if (!xdp_rxq_info_is_reg(&priv->xdp_rxq)) {
>> +		err = xdp_rxq_info_reg(&priv->xdp_rxq, dev, 0);
>> +		if (err < 0)
>> +			return err;
>> +
>> +		err = xdp_rxq_info_reg_mem_model(&priv->xdp_rxq,
>> +						 MEM_TYPE_PAGE_SHARED, NULL);
>
> nit: doesn't matter much but looks like a mix of MEM_TYPE_PAGE_SHARED
> and MEM_TYPE_PAGE_ORDER0

Actually I'm not sure when to use MEM_TYPE_PAGE_ORDER0.
It seems a page allocated by alloc_page() can be freed by page_frag_free(),
and that is more lightweight than put_page(), isn't it? virtio_net does it
in a similar way.

-- 
Toshiaki Makita
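To make the point concrete, here is a minimal kernel-style sketch of the
alloc/free pairing I mean (illustration only, not buildable on its own;
the helper names example_alloc/example_free are mine, the kernel calls
are the ones used in the patch):

```c
#include <linux/gfp.h>
#include <linux/mm.h>

static void *example_alloc(void)
{
	/* One order-0 page; its refcount starts at 1. */
	struct page *page = alloc_page(GFP_ATOMIC);

	if (!page)
		return NULL;
	return page_address(page);	/* hand out the buffer's virtual address */
}

static void example_free(void *head)
{
	/*
	 * page_frag_free() resolves the page from the virtual address
	 * and drops the reference, freeing the page when the count
	 * reaches zero, so no separate virt_to_page() + put_page()
	 * sequence is needed on the free path.
	 */
	page_frag_free(head);
}
```

This is why registering MEM_TYPE_PAGE_SHARED and freeing with
page_frag_free() looks consistent to me even though the pages come from
alloc_page().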