On 16-04-08 12:47 AM, Brenden Blanco wrote:
This patch set introduces new infrastructure for programmatically
processing packets in the earliest stages of rx, as part of an effort
others are calling Express Data Path (XDP) [1]. Start this effort by
introducing a new bpf program type for early packet filtering, before even
an skb has been allocated.

With this, hope to enable line rate filtering, with this initial
implementation providing drop/allow action only.

Patch 1 introduces the new prog type and helpers for validating the bpf
program. A new userspace struct is defined containing only len as a field,
with others to follow in the future.
In patch 2, create a new ndo to pass the fd to support drivers.
In patch 3, expose a new rtnl option to userspace.
In patch 4, enable support in mlx4 driver. No skb allocation is required,
instead a static percpu skb is kept in the driver and minimally initialized
for each driver frag.
In patch 5, create a sample drop and count program. With single core,
achieved ~20 Mpps drop rate on a 40G mlx4. This includes packet data
access, bpf array lookup, and increment.

Hrm. This doesnt sound very high (less than 50%?).
Is the driver the main overhead?
I'd be curious, for comparison, if you just dropped everything
without bpf and alternatively with tc + bpf of the same program
on the one cpu.
Numbers we had for the NUC with tc on single core were a bit higher
than 20Mpps but there was no driver overhead - so i expected
to see much higher numbers if you did it at the driver...
Note back in the day Alexey(not Alexei;->) had a built-in driver
level forwarder;
however the advantage there was derived out of packets being DMAed
from ingress to egress port after some simple lookup.

cheers,
jamal

Reply via email to