On Fri, 7 Dec 2018 12:44:24 +0100 Björn Töpel <bjorn.to...@gmail.com> wrote:
> The rationale behind attach is performance and ease of use. Many XDP > socket users just need a simple way of creating/binding a socket and > receiving frames right away without loading an XDP program. > > XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply > is a kernel provided XDP program that is installed to the netdev when > XDP_ATTACH is being passed as a bind() flag. > > The builtin program is the simplest program possible to redirect a > frame to an attached socket. In restricted C it would look like this: > > SEC("xdp") > int xdp_prog(struct xdp_md *ctx) > { > return bpf_xsk_redirect(ctx); > } > > The builtin program loaded via XDP_ATTACH behaves, from an > install-to-netdev/uninstall-from-netdev point of view, differently > from regular XDP programs. The easiest way to look at it is as a > 2-level hierarchy, where regular XDP programs has precedence over the > builtin one. > > If no regular XDP program is installed to the netdev, the builtin will > be install. If the builtin program is installed, and a regular is > installed, regular XDP program will have precedence over the builtin > one. > > Further, if a regular program is installed, and later removed, the > builtin one will automatically be installed. > > The sxdp_flags field of struct sockaddr_xdp gets two new options > XDP_BUILTIN_SKB_MODE and XDP_BUILTIN_DRV_MODE, which maps to the > corresponding XDP netlink install flags. > > The builtin XDP program functionally adds even more complexity to the > already hard to read dev_change_xdp_fd. Maybe it would be simpler to > store the program in the struct net_device together with the install > flags instead of calling the ndo_bpf multiple times? (As far as I can see from reading the code, correct me if I'm wrong.) If an AF_XDP program uses XDP_ATTACH, then it installs the builtin-program as the XDP program on the "entire" device. That means all RX-queues will call this XDP-bpf program (indirect call), and it is actually only relevant for the specific queue_index. Yes, the helper call does check that the 'xdp->rxq->queue_index' for an attached 'xsk' and return XDP_PASS if it is NULL: +BPF_CALL_1(bpf_xdp_xsk_redirect, struct xdp_buff *, xdp) +{ + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct xdp_sock *xsk; + + xsk = READ_ONCE(xdp->rxq->dev->_rx[xdp->rxq->queue_index].xsk); + if (xsk) { + ri->xsk = xsk; + return XDP_REDIRECT; + } + + return XDP_PASS; +} Why do every normal XDP_PASS packet have to pay this overhead (indirect call), when someone loads an AF_XDP socket program? The AF_XDP socket is tied hard and only relevant to a specific RX-queue (which is why we get a performance boost due to SPSC queues). I acknowledge there is a need for this, but this use-case shows there is a need for attaching XDP programs per RX-queue basis. -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer