On 13 Feb 2019, at 3:32, Magnus Karlsson wrote:
On Mon, Feb 11, 2019 at 9:44 PM Jonathan Lemon
<jonathan.le...@gmail.com> wrote:
On 8 Feb 2019, at 5:05, Magnus Karlsson wrote:
This patch proposes to add AF_XDP support to libbpf. The main reason
for this is to facilitate writing applications that use AF_XDP by
offering higher-level APIs that hide many of the details of the AF_XDP
uapi. This is in the same vein as libbpf facilitates XDP adoption by
offering easy-to-use higher level interfaces of XDP functionality.
Hopefully this will facilitate adoption of AF_XDP, make applications
using it simpler and smaller, and finally also make it possible for
applications to benefit from optimizations in the AF_XDP user space
access code. Previously, people just copied and pasted the code from
the sample application into their application, which is not desirable.
I like the idea of encapsulating the boilerplate logic in a library.
I do think there is an important missing piece though - there should be
some code which queries the netdev for how many queues are attached, and
creates the appropriate number of umem/AF_XDP sockets.
I ran into this issue when testing the current AF_XDP code - on my test
boxes, the mlx5 card has 55 channels (aka queues), so when the test
program binds only to channel 0, nothing works as expected, since not
all traffic is being intercepted. While obvious in hindsight, this took
a while to track down.
Yes, agreed. You are not the first one to stumble upon this problem
:-). Let me think a little bit on how to solve this in a good way. We
need this to be simple and intuitive, as you say.
Has any investigation been done on using some variant of an MPSC
implementation as an intermediate form for AF_XDP? E.g.: something like
LCRQ or the bulkQ in bpf devmap/cpumap. I'm aware that this would be
slightly slower, as it would introduce a lock in the path, but I'd think
that having DEVMAP, CPUMAP and XSKMAP all behave the same way would add
more flexibility.
Ideally, if the configuration matches the underlying hardware, then the
implementation would reduce to the current setup (and allow ZC
implementations), but a non-matching configuration would still work - as
opposed to the current situation.
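
To make the trade-off concrete, here is a minimal user-space sketch of a
bounded multi-producer/single-consumer ring in which producers
serialize on a lock and the single consumer runs lock-free. It is
neither LCRQ nor the kernel's devmap/cpumap bulk queue. The names
(struct mpsc_ring, mpsc_enqueue(), mpsc_dequeue()), the ring size, and
the use of a pthread mutex with C11 atomics are all assumptions made
for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8 /* illustrative; must be a power of two */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be a power of two");

struct mpsc_ring {
	pthread_mutex_t prod_lock; /* serializes producers only */
	_Atomic uint32_t head;     /* next slot to read, written by the consumer */
	_Atomic uint32_t tail;     /* next slot to write, written under prod_lock */
	uint64_t desc[RING_SIZE];  /* stand-in for frame descriptors */
};

void mpsc_init(struct mpsc_ring *r)
{
	pthread_mutex_init(&r->prod_lock, NULL);
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
}

/* Any thread may call this; returns false when the ring is full. */
bool mpsc_enqueue(struct mpsc_ring *r, uint64_t d)
{
	bool ok = false;

	pthread_mutex_lock(&r->prod_lock);
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
	if (tail - head < RING_SIZE) {
		r->desc[tail & (RING_SIZE - 1)] = d;
		/* Publish the slot only after its contents are written. */
		atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
		ok = true;
	}
	pthread_mutex_unlock(&r->prod_lock);
	return ok;
}

/* Only the single consumer may call this; returns false when empty. */
bool mpsc_dequeue(struct mpsc_ring *r, uint64_t *d)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head == tail)
		return false;
	*d = r->desc[head & (RING_SIZE - 1)];
	/* Release the slot back to producers after reading it. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}
```

With a single producer the lock is uncontended, which is roughly the
"configuration matches the hardware" case; with several producers the
lock is the extra cost mentioned above.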
--
Jonathan