On Wed, Oct 02, 2019 at 08:17:59PM +0200, Jiri Pirko wrote:
> Wed, Oct 02, 2019 at 10:40:48AM CEST, ido...@idosch.org wrote:
> >From: Ido Schimmel <ido...@mellanox.com>
> >
> >Today, whenever an IPv4 route is added or deleted a notification is sent
> >in the FIB notification chain and it is up to offload drivers to decide
> >if the route should be programmed to the hardware or not. This is not an
> >easy task as in hardware routes are keyed by {prefix, prefix length,
> >table id}, whereas the kernel can store multiple such routes that only
> >differ in metric / TOS / nexthop info.
> >
> >This series makes sure that only routes that are actually used in the
> >data path are notified to offload drivers. This greatly simplifies the
> >work these drivers need to do, as they are now only concerned with
> >programming the hardware and do not need to replicate the IPv4 route
> >insertion logic and store multiple identical routes.
> >
> >The route that is notified is the first FIB alias in the FIB node with
> >the given {prefix, prefix length, table ID}. In case the route is
> >deleted and there is another route with the same key, a replace
> >notification is emitted. Otherwise, a delete notification is emitted.
> >
> >The above means that in the case of multiple routes with the same key,
> >but different TOS, only the route with the highest TOS is notified.
> >While the kernel can route a packet based on its TOS, this is not
> >supported by any hardware devices I'm familiar with. Moreover, this is
> >not supported by IPv6 nor by BIRD/FRR from what I could see. Offload
> >drivers should therefore use the presence of a non-zero TOS as an
> >indication to trap packets matching the route and let the kernel route
> >them instead. mlxsw has been doing it for the past two years.
> >
> >The series also adds an "in hardware" indication to routes, in addition
>
> I think this might be a separate patchset. I mean patch "ipv4: Replace
> route in list before notifying" and above.

OK. I mainly wanted to have it together in order to submit the tests with the patchset itself. I can split it into two.