On 6/4/19 3:06 PM, Martin Lau wrote: > On Tue, Jun 04, 2019 at 02:17:28PM -0600, David Ahern wrote: >> On 6/3/19 11:29 PM, Martin Lau wrote: >>> On Mon, Jun 03, 2019 at 07:36:06PM -0600, David Ahern wrote: >>>> On 6/3/19 6:58 PM, Martin Lau wrote: >>>>> I have concern on calling ip6_create_rt_rcu() in general which seems >>>>> to trace back to this commit >>>>> dec9b0e295f6 ("net/ipv6: Add rt6_info create function for >>>>> ip6_pol_route_lookup") >>>>> >>>>> This rt is not tracked in pcpu_rt, rt6_uncached_list or exception bucket. >>>>> In particular, how to react to NETDEV_UNREGISTER/DOWN like >>>>> the rt6_uncached_list_flush_dev() does and calls dev_put()? >>>>> >>>>> The existing callers seem to do dst_release() immediately without >>>>> caching it, but still concerning. >>>> >>>> those are the callers that don't care about the dst_entry, but are >>>> forced to deal with it. Removing the tie between fib lookups an >>>> dst_entry is again the right solution. >>> Great to know that there will be a solution. It would be great >>> if there is patch (or repo) to show how that may look like on >>> those rt6_lookup() callers. >> >> Not 'will be', 'there is' a solution now. Someone just needs to do the >> conversions and devise the tests for the impacted users. > I don't think everyone will convert to the new nexthop solution > immediately. > > How about ensuring the existing usage stays solid first?
Use of nexthop objects has nothing to do with separating fib lookups from dst_entries, but with the addition of fib6_result it Just Works. Wei converted ipv6 to use exception caches instead of adding them to the FIB. I converted ipv6 to use separate data structures for fib entries, added direct fib6 lookup functions and added fib6_result. See the net/core/filter.c. The stage is set for converting users. For example, ip6_nh_lookup_table does not care about the dst entry, only the fib entry. This converts it: static int ip6_nh_lookup_table(struct net *net, struct fib6_config *cfg, const struct in6_addr *gw_addr, u32 tbid, int flags, struct fib6_result *res) { struct flowi6 fl6 = { .flowi6_oif = cfg->fc_ifindex, .daddr = *gw_addr, .saddr = cfg->fc_prefsrc, }; struct fib6_table *table; struct rt6_info *rt; table = fib6_get_table(net, tbid); if (!table) return -EINVAL; if (!ipv6_addr_any(&cfg->fc_prefsrc)) flags |= RT6_LOOKUP_F_HAS_SADDR; flags |= RT6_LOOKUP_F_IGNORE_LINKSTATE; fib6_table_lookup(net, table, cfg->fc_ifindex, fl6, res, flags); if (res.f6i == net->ipv6.fib6_null_entry) return -ENETUNREACH; fib6_select_path(net, &res, fl6, oif, false, NULL, flags); return 0; }