Re: [RFC feedback] AF_XDP and non-Intel hardware

Luke Gorrie Tue, 22 May 2018 02:15:28 -0700

On 21 May 2018 at 20:55, Björn Töpel <bjorn.to...@gmail.com> wrote:
>
> 2018-05-21 14:34 GMT+02:00 Mykyta Iziumtsev <mykyta.iziumt...@linaro.org>:
> > Hi Björn and Magnus,
> >
> > (This thread is a follow up to private dialogue. The intention is to
> > let community know that AF_XDP can be enhanced further to make it
> > compatible with wider range of NIC vendors).
> >
>
> Mykyta, thanks for doing the write-up and sending it to the netdev
> list! The timing could not be better -- we need to settle on an uapi
> that works for all vendors prior enabling it in the kernel.


[Resending with vger-compatible formatting.]

So! The discussion here seems to be about how to make the XDP uapi
accommodate all hardware vendors but I wanted to chime in with a
userspace application developer perspective (remember us? ;-))

These days more and more people understand the weird and wonderful
ways that NICs want to deal with packet memory. Scatter-gather lists;
typewriter buffers; payload inline in descriptors; metadata inline in
payload; constraints on buffer size; constraints on buffer alignment;
etc; etc; etc.

How about userspace applications though? We also have our own ideas
about the ways that things should be done. I think there is a
fundamental tension here: the more flexibility you provide to
hardware, the more constraints you impose on applications, and vice
versa.

To be concrete let me explain the peculiar way that we handle packet
memory in the Snabb application.  Snabb uses a simple representation
of packets in memory:

  struct packet {
    uint16_t length;                /* number of valid bytes in data[] */
    unsigned char data[10 * 1024];  /* packet payload */
  };

and a special allocator so that the virtual address of each packet:

- Is identical in every process that can share traffic;
- Can be mapped on demand (via SIGSEGV fault handler);
- Can be used to calculate the DMA (physical) address;
- Can be used to calculate how much headroom is available.
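
As a rough illustration of how a bare pointer can carry this kind of
information (a sketch under assumed conventions, not Snabb's actual
code): suppose the allocator maps packet memory at the physical address
XOR a fixed tag, and aligns packet slots to a power of two. ADDR_TAG,
PACKET_ALIGNMENT and both function names are hypothetical, and a 64-bit
address space is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packet layout from the message above. */
struct packet {
  uint16_t length;
  unsigned char data[10 * 1024];
};

/* Assumptions for illustration only (not taken from the message):
 * - packet memory is mapped at virtual = physical ^ ADDR_TAG;
 * - packet slots start on PACKET_ALIGNMENT boundaries, and a packet
 *   pointer may be moved forward within its slot to leave headroom. */
#define ADDR_TAG         0x500000000000ULL
#define PACKET_ALIGNMENT 512ULL   /* must be a power of two */

/* DMA (physical) address recovered from the virtual address alone.
 * Because the mapping is a pure function of the physical address,
 * every process using the same rule sees the same virtual address. */
static uint64_t packet_dma_address(const void *virt)
{
  assert(virt != NULL);
  return (uint64_t)(uintptr_t)virt ^ ADDR_TAG;
}

/* Headroom available in front of the packet: the offset of the
 * pointer from the preceding alignment boundary. */
static uint64_t packet_headroom(const void *virt)
{
  assert(virt != NULL);
  return (uint64_t)(uintptr_t)virt & (PACKET_ALIGNMENT - 1);
}
```

Under these assumptions no per-packet metadata table is needed: both
values fall out of pointer arithmetic.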

So our scheme is fairly nuanced. Just now this seems to fit well with
most NICs, which allow scatter-gather operation from memory allocated
independently by the application, but we have to resolve an impedance
mismatch (copy) for e.g. the typewriter model. Overall this situation is
quite acceptable.

How would this fit with the XDP uapi though? Can we preserve these
properties of our packets and make them XDP-compatible? The ideal for
us would probably be to replace the code that allocates a HugeTLB page for
packet data with an equivalent that allocates a chunk of
XDP-compatible memory that we can slice up and mremap to suit our taste.

If that is not possible then I see a couple of alternatives:

One would be to drop all of our invariants on packet addresses and
switch to a more middle-of-the-road design that puts everything inline
into the packet (an "sk_buff-alike.") Then we would outsource all the
allocation to the kernel, which would do it specially to suit the
hardware from $VENDOR. (And hopefully deal somehow with mixing traffic
from $OTHERVENDOR too, etc.)

The other alternative would be to preserve our current packet
structure and introduce a copy into separate XDP memory on
transmit/receive. This is the approach that we take today with
vhost-user and is the approach we would take if we supported a
"typewriter" style NIC too.

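The copy alternative could look roughly like the sketch below. It is
illustrative only: FRAME_SIZE and the two function names are
hypothetical, and a "frame" here is just a byte buffer standing in for
whatever memory the kernel or NIC would actually require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Packet layout from the message above. */
struct packet {
  uint16_t length;
  unsigned char data[10 * 1024];
};

/* Hypothetical size of one frame in the separate, NIC-facing memory. */
#define FRAME_SIZE 2048u

/* Transmit path: copy the payload out of the application's packet
 * into a frame. Returns the number of bytes copied, or -1 if the
 * packet does not fit in a single frame. */
static int copy_to_frame(const struct packet *p, unsigned char *frame)
{
  assert(p != NULL && frame != NULL);
  if (p->length > FRAME_SIZE)
    return -1;
  memcpy(frame, p->data, p->length);
  return p->length;
}

/* Receive path: copy len bytes from a frame into the application's
 * packet. Returns len, or -1 if len exceeds the packet's capacity. */
static int copy_from_frame(struct packet *p, const unsigned char *frame,
                           size_t len)
{
  assert(p != NULL && frame != NULL);
  if (len > sizeof p->data)
    return -1;
  memcpy(p->data, frame, len);
  p->length = (uint16_t)len;
  return (int)len;
}
```

The application keeps its own packet layout and address invariants; the
price is one memcpy per packet in each direction.
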
I'm not immediately wild about either of those options though, and I
am not sure how keen the next wave of application developers turning up
over the next 5-10 years and "doing it our way" will be either.

So, anyway, that is my braindump on trying to understand how suitable
XDP would be for us as application developers, and how much of this
depends on the fine details that are being discussed on this thread.
I hope this perspective is a useful complement to the feedback from
hardware makers.

Cheers,
-Luke
