2017-11-14 18:19 GMT+01:00 Jesper Dangaard Brouer <bro...@redhat.com>:
>
> On Mon, 13 Nov 2017 22:07:47 +0900
> Björn Töpel <bjorn.to...@gmail.com> wrote:
>
>> I'll summarize the major points that we'll address in the next RFC
>> below.
>>
>> * Instead of extending AF_PACKET with yet another version, introduce a
>>   new address/packet family. As for naming, we had some name
>>   suggestions: AF_CAPTURE, AF_CHANNEL, AF_XDP and AF_ZEROCOPY. We'll
>>   go for AF_ZEROCOPY, unless there are strong opinions against it.
>
> I mostly like AF_CHANNEL and AF_XDP. I do know XDP is (or has evolved
> into) a kernel-side facility that moves XDP-frames/packets _inside_
> the kernel.
>
> *BUT* I've always imagined that we would create a "channel" to
> userspace, by using XDP_REDIRECT to choose which frames get redirected
> into which userspace "channel" (new channel-map type). Userspace
> pre-allocates and registers memory/pages exactly like in this patchset.
>
> [Step 1]: (non-ZC) XDP_REDIRECT needs to copy frame-data into userspace
> memory pages, and update your packet_array etc. (Use map-flush to get
> RX bulking.)
>
> [Step 2]: (ZC) Userspace calls a driver NDO to register pages. The
> XDP_REDIRECT action happens in the driver, and can have knowledge about
> the RX-ring. It can know if this RX-ring is Zero-Copy enabled and can
> skip the copy-step.
>
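To make the copy and zero-copy steps quoted above concrete, here is a
minimal user-space sketch in C. It is only a model under stated
assumptions: `struct channel`, `struct rx_ring`, `struct rx_desc` and
`channel_redirect()` are hypothetical names made up for illustration, not
kernel structures or functions, and the frame size and count are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FRAME_SIZE 2048 /* hypothetical size of one registered frame */
#define NUM_FRAMES 4    /* hypothetical number of registered frames */

/* Hypothetical "channel": memory that userspace registered up front. */
struct channel {
    unsigned char frames[NUM_FRAMES][FRAME_SIZE];
    size_t len[NUM_FRAMES];   /* packet length held by each frame */
    bool filled[NUM_FRAMES];  /* frame currently holds a packet */
    unsigned int copies;      /* number of copies performed */
};

/* Hypothetical RX ring; zc_enabled models "pages registered via NDO". */
struct rx_ring {
    bool zc_enabled;
};

/* Hypothetical received-packet descriptor. */
struct rx_desc {
    const unsigned char *data;
    size_t len;
    int frame; /* registered frame the NIC wrote to (ZC rings only) */
};

/* Returns the index of the frame holding the packet, or -1 on drop. */
int channel_redirect(struct channel *ch, const struct rx_ring *ring,
                     const struct rx_desc *desc)
{
    int i;

    if (desc->len > FRAME_SIZE)
        return -1;

    if (ring->zc_enabled) {
        /* ZC: the packet already sits in a registered frame, so the
         * copy step is skipped and only bookkeeping is updated. */
        assert(desc->frame >= 0 && desc->frame < NUM_FRAMES);
        ch->len[desc->frame] = desc->len;
        ch->filled[desc->frame] = true;
        return desc->frame;
    }

    /* Non-ZC: copy into the first free registered frame. Nothing is
     * allocated here; the memory limit is whatever was registered. */
    for (i = 0; i < NUM_FRAMES; i++) {
        if (!ch->filled[i]) {
            memcpy(ch->frames[i], desc->data, desc->len);
            ch->len[i] = desc->len;
            ch->filled[i] = true;
            ch->copies++;
            return i;
        }
    }
    return -1; /* all registered frames are in use: drop */
}
```

In this model the only difference between the two paths is whether
`memcpy()` runs; both deliver into memory that userspace registered in
advance, which is why the non-ZC path needs no allocation either.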
Jesper, I *really* like this approach -- especially the fact that the
existing XDP path in the drivers can be reused. I'll spend some time
dissecting the details of your suggestion.

>> * No explicit zerocopy enablement. Use the zerocopy path if
>>   supported; if not, fall back to the skb path for netdevs that
>>   don't support the required ndos.
>
> When the driver does not support the NDO in the above model, I think
> that there will still be a significant performance boost for the
> non-ZC variant, even though we need a copy operation, because there
> are no memory allocations: userspace has preallocated and registered
> pages with the kernel (and mem-limits are implicit via the mem-size
> registered by userspace).
>

Yup, and we're not paying for the whole skb creation, given that we
execute from XDP_DRV and not XDP_SKB.

>> * Do not introduce a new XDP action XDP_PASS_TO_KERNEL, instead use
>>   XDP redirect map call with ingress flag.
>
> In the above model, XDP_REDIRECT is used for filtering into a userspace
> "channel". If ZC gets enabled on an RX-ring queue, then XDP_PASS has
> to do a copy (RX-ring knowledge is available), like you describe with
> XDP_PASS_TO_KERNEL.
>

Again, this fits in nicely.

>> * Extend the XDP redirect to support explicit allocator/destructor
>>   functions. Right now, XDP redirect assumes that the page allocator
>>   was used, and the XDP redirect cleanup path is decreasing the page
>>   count of the XDP buffer. This assumption breaks for the zerocopy
>>   case.
>
> Yes, please. If XDP_REDIRECT gets to call a destructor callback, then
> we can allow XDP_REDIRECT out another net_device, even when ZC is
> enabled on an RX-ring queue.
>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
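
As a rough illustration of the allocator/destructor point, the sketch
below models a buffer that carries its own release callback, so the
redirect cleanup path no longer has to assume the page allocator. This is
a user-space model only: `struct buf_mem_ops`, `struct xdp_buf_model`,
`struct page_model`, `struct zc_pool` and `redirect_done()` are
hypothetical names invented for this example, not existing kernel API,
and only the destructor half is shown.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8 /* hypothetical capacity of the zero-copy pool */

/* Hypothetical per-buffer memory operations. */
struct buf_mem_ops {
    void (*release)(void *buf, void *ctx);
};

/* Hypothetical buffer handed to the redirect path. */
struct xdp_buf_model {
    void *data;
    const struct buf_mem_ops *ops; /* how to give this buffer back */
    void *ctx;                     /* owner-specific context */
};

/* Page-allocator case: cleanup drops a reference on the page. */
struct page_model {
    int refcount;
};

static void page_release(void *buf, void *ctx)
{
    struct page_model *page = buf;

    (void)ctx; /* unused for page-backed buffers */
    assert(page->refcount > 0);
    page->refcount--;
}

/* Zero-copy case: the frame belongs to userspace-registered memory, so
 * it is returned to its owner's free list instead of being freed. */
struct zc_pool {
    void *free_list[POOL_SIZE];
    unsigned int nfree;
};

static void zc_release(void *buf, void *ctx)
{
    struct zc_pool *pool = ctx;

    assert(pool->nfree < POOL_SIZE);
    pool->free_list[pool->nfree++] = buf;
}

const struct buf_mem_ops page_ops = { .release = page_release };
const struct buf_mem_ops zc_ops = { .release = zc_release };

/* Cleanup after a redirect completes: call the buffer's own destructor
 * rather than assuming the buffer came from the page allocator. */
void redirect_done(struct xdp_buf_model *buf)
{
    buf->ops->release(buf->data, buf->ctx);
}
```

With a callback attached to each buffer, the device that transmits a
redirected frame does not need to know where the memory came from, which
is what would let a ZC-enabled RX-ring still redirect out another
net_device.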