On 6/20/19 1:01 PM, Stefano Brivio wrote: >>> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c >>> index 94e5d83db4db..03f51e5192e5 100644 >>> --- a/net/ipv4/fib_trie.c >>> +++ b/net/ipv4/fib_trie.c >>> @@ -2078,28 +2078,51 @@ void fib_free_table(struct fib_table *tb) >>> call_rcu(&tb->rcu, __trie_free_rcu); >>> } >>> >>> +static int fib_dump_fnhe_from_leaf(struct fib_alias *fa, struct sk_buff >>> *skb, >>> + struct netlink_callback *cb, >>> + int *fa_index, int fa_start) >>> +{ >>> + struct fib_info *fi = fa->fa_info; >>> + int nhsel; >>> + >>> + if (!fi || fi->fib_flags & RTNH_F_DEAD) >>> + return 0; >>> + >>> + for (nhsel = 0; nhsel < fib_info_num_path(fi); nhsel++) { >>> + int err; >>> + >>> + err = fnhe_dump_buckets(fa, nhsel, skb, cb, fa_index, fa_start); >>> + if (err) >>> + return err; >>> + } >>> + >>> + return 0; >>> +} >> >> fib_info would be the better argument to pass in to the fnhe dump, and > > ...we need to pass the table ID to rt_fill_info(). Sure, I can pass > that explicitly, but doing so kind of tells me I'm not passing the > right argument, with sufficient information. What do you think?
I think that is preferable to passing fib_alias. > >> I think the loop over where the bucket is should be in route.c as well. >> So how about fib_info_dump_fnhe() as the helper exposed from route.c, >> and it does the loop over nexthops and calls fnhe_dump_buckets. > > Yes, I could do that conveniently if I'm passing a fib_info there. I'm > stlll undecided if it's worth it, I guess I don't really have a > preference. > >> As for the loop, you could fill an skb without finishing a bucket inside >> of a nexthop so you need top track which nexthop is current as well. > > I think this is not a problem, and also checked that selftests trigger > this. Buckets are transparent to the counter for partial dumps (s_fa), > they are just an arbitrary grouping from that perspective, just like > items on the chain for the same bucket. > > Take this example, s_i values in [], s_fa values in (): > > node (fa) #1 [1] > nexthop #1 > bucket #1 -> #0 in chain (1) > bucket #2 -> #0 in chain (2) -> #1 in chain (3) -> #2 in chain (4) > bucket #3 -> #0 in chain (5) -> #1 in chain (6) > > nexthop #2 > bucket #1 -> #0 in chain (7) -> #1 in chain (8) > bucket #2 -> #0 in chain (9) > -- > node (fa) #2 [2] > nexthop #1 > bucket #1 -> #0 in chain (1) -> #1 in chain (2) > bucket #2 -> #0 in chain (3) > > > If I stop at (3), (4), (7) for "node #1", or at (2) for "node #2", it > doesn't really matter, because nexthops and buckets are always > traversed in the same way (walking flattens all that). > > For IPv4, I could even drop the in-tree/in-node distinction (s_i/s_fa). > But accounting becomes terribly inconvenient though, and it would be > inconsistent with what I needed to do for IPv6 (skip/skip_in_node): we > have 'sernum' there, which is used to mark what node we need to restart > from in case of changes. Within a node, however, I can't make any > assumptions like that, so if the fib6 tree changes, I'll restart from > the beginning of the node (see discussion with Martin on v1). > > My idea would be to keep it like it is at the moment, and later make it > as "accurate" as it is on IPv6, introducing something like 'sernum'. If > we start with this, it will be more convenient to do that later. > ok, if you have it covered. can you add that description above to the commit message. Be good to capture how it is covered.