Wed, Oct 02, 2019 at 10:40:48AM CEST, ido...@idosch.org wrote:
>From: Ido Schimmel <ido...@mellanox.com>
>
>Today, whenever an IPv4 route is added or deleted a notification is sent
>in the FIB notification chain and it is up to offload drivers to decide
>if the route should be programmed to the hardware or not. This is not an
>easy task as in hardware routes are keyed by {prefix, prefix length,
>table id}, whereas the kernel can store multiple such routes that only
>differ in metric / TOS / nexthop info.
>
>This series makes sure that only routes that are actually used in the
>data path are notified to offload drivers. This greatly simplifies the
>work these drivers need to do, as they are now only concerned with
>programming the hardware and do not need to replicate the IPv4 route
>insertion logic and store multiple identical routes.
>
>The route that is notified is the first FIB alias in the FIB node with
>the given {prefix, prefix length, table ID}. In case the route is
>deleted and there is another route with the same key, a replace
>notification is emitted. Otherwise, a delete notification is emitted.
>
>The above means that in the case of multiple routes with the same key,
>but different TOS, only the route with the highest TOS is notified.
>While the kernel can route a packet based on its TOS, this is not
>supported by any hardware devices I'm familiar with. Moreover, this is
>not supported by IPv6 nor by BIRD/FRR from what I could see. Offload
>drivers should therefore use the presence of a non-zero TOS as an
>indication to trap packets matching the route and let the kernel route
>them instead. mlxsw has been doing it for the past two years.
>
>The series also adds an "in hardware" indication to routes, in addition

I think this could be a separate patchset: patch "ipv4: Replace route in
list before notifying" and the patches after it.

>to the offload indication we already have on nexthops today. Besides
>being long overdue, the reason this is done in this series is that it
>makes it possible to easily test the new FIB notification API over
>netdevsim.
>
>To ensure there is no degradation in route insertion rates, I used
>Vincent Bernat's script [1][2] from [3] to inject 500,000 routes from an
>MRT dump from a router with a full view. On a system with Intel(R)
>Xeon(R) CPU D-1527 @ 2.20GHz I measured 8.184 seconds, averaged over 10
>runs and saw no degradation compared to net-next from today.
>
>Patchset overview:
>Patches #1-#7 introduce the new FIB notifications
>Patches #8-#9 convert listeners to make use of the new notifications
>Patches #10-#14 add "in hardware" indication for IPv4 routes, including
>a dummy FIB offload implementation in netdevsim
>Patch #15 adds a selftest for the new FIB notifications API over
>netdevsim
>
>The series is based on Jiri's "devlink: allow devlink instances to
>change network namespace" series [4]. The patches can be found here [5]
>and patched iproute2 with the "in hardware" indication can be found here
>[6].
>
>IPv6 is next on my TODO list.
>
>[1] https://github.com/vincentbernat/network-lab/blob/master/common/helpers/lab-routes-ipvX/insert-from-bgp
>[2] https://gist.github.com/idosch/2eb96efe50eb5234d205e964f0814859
>[3] https://vincent.bernat.ch/en/blog/2017-ipv4-route-lookup-linux
>[4] https://patchwork.ozlabs.org/cover/1162295/
>[5] https://github.com/idosch/linux/tree/fib-notifier
>[6] https://github.com/idosch/iproute2/tree/fib-notifier
>
>Ido Schimmel (15):
>  ipv4: Add temporary events to the FIB notification chain
>  ipv4: Notify route after insertion to the routing table
>  ipv4: Notify route if replacing currently offloaded one
>  ipv4: Notify newly added route if should be offloaded
>  ipv4: Handle route deletion notification
>  ipv4: Handle route deletion notification during flush
>  ipv4: Only Replay routes of interest to new listeners
>  mlxsw: spectrum_router: Start using new IPv4 route notifications
>  ipv4: Remove old route notifications and convert listeners
>  ipv4: Replace route in list before notifying
>  ipv4: Encapsulate function arguments in a struct
>  ipv4: Add "in hardware" indication to routes
>  mlxsw: spectrum_router: Mark routes as "in hardware"
>  netdevsim: fib: Mark routes as "in hardware"
>  selftests: netdevsim: Add test for route offload API
>
> .../net/ethernet/mellanox/mlx5/core/lag_mp.c  |   4 -
> .../ethernet/mellanox/mlxsw/spectrum_router.c | 152 ++-----
> drivers/net/ethernet/rocker/rocker_main.c     |   4 +-
> drivers/net/netdevsim/fib.c                   | 263 ++++++++++-
> include/net/ip_fib.h                          |   5 +
> include/uapi/linux/rtnetlink.h                |   1 +
> net/ipv4/fib_lookup.h                         |  18 +-
> net/ipv4/fib_semantics.c                      |  30 +-
> net/ipv4/fib_trie.c                           | 223 ++++++++--
> net/ipv4/route.c                              |  12 +-
> .../drivers/net/netdevsim/fib_notifier.sh     | 411 ++++++++++++++++++
> 11 files changed, 938 insertions(+), 185 deletions(-)
> create mode 100755 tools/testing/selftests/drivers/net/netdevsim/fib_notifier.sh
>
>--
>2.21.0
>
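
For anyone following the thread, the notification rule described in the
cover letter (only the first FIB alias of a {prefix, prefix length,
table ID} node is notified; deleting it emits a replace if another route
with the same key remains, otherwise a delete) can be sketched as a toy
model. This is not the kernel code from the series. The names Route and
ToyFib, the event strings, and the alias ordering (highest TOS first,
then lowest metric) are assumptions made for illustration only, as is
the choice to report a newly inserted first alias as "add" or "replace".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical stand-in for a FIB alias; not a kernel structure."""
    prefix: str
    prefix_len: int
    table_id: int
    tos: int = 0
    metric: int = 0
    nexthop: str = ""

    @property
    def key(self):
        # Hardware keys routes by {prefix, prefix length, table ID} only.
        return (self.prefix, self.prefix_len, self.table_id)


class ToyFib:
    """Toy FIB that notifies listeners only about the first alias per key."""

    def __init__(self):
        self.nodes = {}   # key -> list of Route, first entry is the one in use
        self.events = []  # recorded (event, Route) notifications

    def add(self, route):
        aliases = self.nodes.setdefault(route.key, [])
        had_first = bool(aliases)
        aliases.append(route)
        # Assumed ordering: highest TOS first, then lowest metric.
        # The sort is stable, so an equal newcomer stays behind older routes.
        aliases.sort(key=lambda r: (-r.tos, r.metric))
        if aliases[0] is route:
            self.events.append(("replace" if had_first else "add", route))

    def delete(self, route):
        aliases = self.nodes[route.key]
        was_first = aliases[0] == route
        aliases.remove(route)
        if not was_first:
            return  # a route the data path never used: nothing to notify
        if aliases:
            # Another route with the same key takes over.
            self.events.append(("replace", aliases[0]))
        else:
            self.events.append(("delete", route))
            del self.nodes[route.key]


fib = ToyFib()
a = Route("192.0.2.0", 24, 254, metric=100, nexthop="198.51.100.1")
b = Route("192.0.2.0", 24, 254, metric=50, nexthop="198.51.100.2")
c = Route("192.0.2.0", 24, 254, metric=200, nexthop="198.51.100.3")

fib.add(a)     # first route for this key: notified
fib.add(b)     # lower metric, becomes first alias: notified as a replace
fib.add(c)     # sorts last: no notification
fib.delete(b)  # first alias removed, a takes over: replace
fib.delete(c)  # not the first alias: no notification
fib.delete(a)  # last route with this key: delete
```

In this model a listener sees four events for six operations and never
has to store more than one route per key, which is the simplification
the cover letter is after.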