2017-11-16 4:35 GMT+01:00 Willem de Bruijn <willemdebruijn.ker...@gmail.com>:
> On Wed, Nov 15, 2017 at 9:55 PM, Alexei Starovoitov <a...@fb.com> wrote:
>> On 11/14/17 4:20 AM, Willem de Bruijn wrote:
>>>>>>>
>>>>>>> * Limit the scope of the first patchset to Rx only, and introduce Tx
>>>>>>> in a separate patchset.
>>>>>>
>>>>>> all sounds good to me except above bit.
>>>>>> I don't remember people suggesting to split it this way.
>>>>>> What's the value of it without tx?
>>>>>>
>>>>>
>>>>> We definitely need Tx for our use-cases! I'll rephrase, so the
>>>>> idea was making the initial patch set without Tx *driver*
>>>>> specific code, e.g. use ndo_xdp_xmit/flush at a later point.
>>>>>
>>>>> So AF_ZEROCOPY, the socket parts, would have Tx support.
>>>>>
>>>>> @John Did I recall that correctly?
>>>>>
>>>>
>>>> Yep, that is what I said. However, on second thought, without the
>>>> driver tx half I guess tx will be significantly slower.
>>>
>>> The idea was that existing packet rings already send without
>>> copying, so the benefit from device driver changes is not obvious.
>>>
>>> I would leave them out for now and evaluate before possibly
>>> sending a separate patchset.
>>
>> are you suggesting to use new af_zerocopy for rx and old
>> af_packet for tx ? imo that's too cumbersome to use.
>> New interface has to be symmetrical from the start.
>
> No, that tx can be implemented without device driver
> changes. At least initially.
>
> Unlike rx, tx does not need driver support to implement
> copy avoidance, as pf_packet tx_ring already has this.
>
> Having to go through ndo_start_xmit does introduce other
> overhead, notably skb alloc. Perhaps ndo_xdp_xmit is a
> better choice (but I'm not very familiar with that).
>
> If some cost is inherent to a device-independent solution
> and needs driver support to avoid it, then that can be added
> in a follow-on patchset. But this one is large already without
> the i40e tx patch.

Ideally, we would not have to introduce yet another xmit ndo. I believe
ndo_xdp_xmit/ndo_xdp_flush would be the best fit, but we would need to
extend it with a destructor callback and potentially some kind of DMA
trait.

Why DMA? For zerocopy, we know the working set of packet buffers, so
they are DMA mapped up front, whereas ndo_xdp_xmit does yet another DMA
mapping. Paying for the DMA mapping in the fast-path is something we'd
like to avoid.
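
To make the DMA point concrete, here is a rough userspace model of the idea. It is only a sketch, not kernel code: every name in it (frame_model, dev_model, model_xmit, the premapped flag standing in for a "DMA trait", and the destructor signature) is hypothetical and is not part of ndo_xdp_xmit or any existing API. It only counts how many mappings land in the transmit path when the working set is mapped up front versus mapped on every send, and shows a completion callback handing the buffer back to its owner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
typedef uint64_t dma_addr_model_t;

struct frame_model {
    void *data;
    size_t len;
    dma_addr_model_t dma;  /* valid only when premapped is set */
    int premapped;         /* stand-in for a "DMA trait": mapped up front */
    /* completion callback so the owner can recycle the buffer */
    void (*destructor)(struct frame_model *f, void *ctx);
    void *ctx;
};

struct dev_model {
    unsigned long setup_maps;    /* mappings done once, at setup time */
    unsigned long fastpath_maps; /* mappings done inside the xmit path */
};

/* Pretend mapping: the "bus address" is just the pointer value. */
static dma_addr_model_t model_dma_map(void *p)
{
    return (dma_addr_model_t)(uintptr_t)p;
}

/* Map the whole, known working set once, outside the fast path. */
static void model_premap(struct dev_model *dev, struct frame_model *frames,
                         size_t n)
{
    for (size_t i = 0; i < n; i++) {
        frames[i].dma = model_dma_map(frames[i].data);
        frames[i].premapped = 1;
        dev->setup_maps++;
    }
}

/* Transmit one frame: map only if the frame lacks the premapped trait. */
static int model_xmit(struct dev_model *dev, struct frame_model *f)
{
    if (!f->premapped) {
        f->dma = model_dma_map(f->data);
        dev->fastpath_maps++;
    }
    /* A real driver would post a descriptor and complete later; this
     * model completes immediately and invokes the destructor. */
    if (f->destructor)
        f->destructor(f, f->ctx);
    return 0;
}

/* Example destructor: count buffers handed back to the owner. */
static void model_return_to_ring(struct frame_model *f, void *ctx)
{
    unsigned long *returned = ctx;

    (void)f;
    (*returned)++;
}

/* Send a 4-frame working set `rounds` times; return fast-path mappings. */
unsigned long model_run(int premap, size_t rounds, unsigned long *returned)
{
    enum { NFRAMES = 4 };
    static unsigned char bufs[NFRAMES][64];
    struct frame_model frames[NFRAMES];
    struct dev_model dev = {0};

    for (size_t i = 0; i < NFRAMES; i++) {
        frames[i] = (struct frame_model){
            .data = bufs[i],
            .len = sizeof(bufs[i]),
            .destructor = model_return_to_ring,
            .ctx = returned,
        };
    }
    if (premap)
        model_premap(&dev, frames, NFRAMES);
    for (size_t r = 0; r < rounds; r++)
        for (size_t i = 0; i < NFRAMES; i++)
            model_xmit(&dev, &frames[i]);
    return dev.fastpath_maps;
}
```

In this model, model_run(1, 10, &n) performs no mappings in the transmit path, while model_run(0, 10, &n) performs one per send. That is the cost we would like to keep out of the fast path; whether a flag, a separate entry point, or something else is the right way to express it in the real ndo is an open question.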