On Fri, 7 Jun 2019 00:58:52 +0200
Stefano Brivio <sbri...@redhat.com> wrote:

> On Thu, 6 Jun 2019 22:37:11 +0000
> Martin Lau <ka...@fb.com> wrote:
> 
> > On Fri, Jun 07, 2019 at 12:17:47AM +0200, Stefano Brivio wrote:  
> > > On Thu, 6 Jun 2019 21:44:58 +0000
> > > Martin Lau <ka...@fb.com> wrote:
> > >     
> > > > > +     if (!(filter->flags & RTM_F_CLONED)) {
> > > > > +             err = rt6_fill_node(net, arg->skb, rt, NULL, NULL, 
> > > > > NULL, 0,
> > > > > +                                 RTM_NEWROUTE,
> > > > > +                                 NETLINK_CB(arg->cb->skb).portid,
> > > > > +                                 arg->cb->nlh->nlmsg_seq, flags);
> > > > > +             if (err)
> > > > > +                     return err;
> > > > > +     } else {
> > > > > +             flags |= NLM_F_DUMP_FILTERED;
> > > > > +     }
> > > > > +
> > > > > +     bucket = rcu_dereference(rt->rt6i_exception_bucket);
> > > > > +     if (!bucket)
> > > > > +             return 0;
> > > > > +
> > > > > +     for (i = 0; i < FIB6_EXCEPTION_BUCKET_SIZE; i++) {
> > > > > +             hlist_for_each_entry(rt6_ex, &bucket->chain, hlist) {
> > > > > +                     if (rt6_check_expired(rt6_ex->rt6i))
> > > > > +                             continue;
> > > > > +
> > > > > +                     err = rt6_fill_node(net, arg->skb, rt,
> > > > > +                                         &rt6_ex->rt6i->dst,
> > > > > +                                         NULL, NULL, 0, RTM_NEWROUTE,
> > > > > +                                         
> > > > > NETLINK_CB(arg->cb->skb).portid,
> > > > > +                                         arg->cb->nlh->nlmsg_seq, 
> > > > > flags);      
> > > > Thanks for the patch.
> > > > 
> > > > A question on when rt6_fill_node() returns -EMSGSIZE while dumping the
> > > > exception bucket here.  Where will the next inet6_dump_fib() start?    
> > > 
> > > And thanks for reviewing.
> > > 
> > > It starts again from the same node, see fib6_dump_node(): w->leaf = rt;
> > > where rt is the fib6_info where we failed dumping, so we won't skip
> > > dumping any node.    
> > If the same node will be dumped, does it mean that it will go through this
> > loop and iterate all exceptions again?  
> 
> Yes (well, all the exceptions for that node).
> 
> > > This also means that to avoid sending duplicates in the case where at
> > > least one rt6_fill_node() call goes through and one fails, we would
> > > need to track the last bucket and entry sent, or, alternatively, to
> > > make sure we can fit the whole node before dumping.    
> > My other concern is that the dump may never finish.
> 
> That's not a guarantee in general, even without this, because in theory
> the skb passed might be small enough that we can't even fit a single
> node without exceptions.
> 
> We could add a guard on w->leaf not being the same before and after the
> walk in inet6_dump_fib() and, if it is, terminate the dump. I just
> wonder if we have to do this at all -- I can't find this being done
> anywhere else (at a quick look at least).

I still can't convince myself this is an actual issue, but... somewhat
simpler: let's add a field to fib6_walker that counts the entries
(both FIB routes and exceptions) already dumped for the current node:

                res = rt6_dump_route(rt, w->args);
                if (res) {
                        /* Frame is full, suspend walking */
                        w->leaf = rt;
                        w->skip_node = res;
                        return 1;
                }

If the current leaf changes (because the tree changed), we reset the
counter. On the next pass, rt6_dump_route() uses it to skip the
rt6_fill_node() calls for entries that were already dumped. What do you
think?
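
To make the idea concrete, here is a minimal userspace sketch of the
skip counter. It is not kernel code: struct node, struct frame,
struct walker, fill_entry(), dump_node(), dump_pass() and dump_all()
are all hypothetical stand-ins, and the frame is assumed to hold a
fixed number of entries rather than bytes. It only shows that resuming
on the same node with a per-node counter avoids both duplicates and
skipped entries:

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_SLOTS 3	/* deliberately tiny, to force suspensions */
#define MAX_ENTRIES 8

/* Hypothetical node: entry 0 plays the FIB route, the rest exceptions. */
struct node {
	int entries[MAX_ENTRIES];
	size_t count;
};

/* Hypothetical stand-in for the skb handed to one dump pass. */
struct frame {
	int slots[FRAME_SLOTS];
	size_t used;
};

/* Hypothetical walker state kept between passes. */
struct walker {
	size_t next;			/* index of the node to visit next */
	const struct node *leaf;	/* node the last pass worked on */
	size_t skip;			/* entries of leaf already dumped */
};

/* Returns 0 on success, -1 if the frame is full (think -EMSGSIZE). */
static int fill_entry(struct frame *f, int entry)
{
	if (f->used == FRAME_SLOTS)
		return -1;
	f->slots[f->used++] = entry;
	return 0;
}

/*
 * Dump the entries of n, skipping the first *skip ones.
 * Returns 1 if the frame filled up (*skip then holds the resume
 * position), 0 if the node was dumped completely (*skip is reset).
 */
static int dump_node(const struct node *n, struct frame *f, size_t *skip)
{
	assert(*skip <= n->count);
	for (size_t i = *skip; i < n->count; i++) {
		if (fill_entry(f, n->entries[i]) < 0) {
			*skip = i;
			return 1;
		}
	}
	*skip = 0;
	return 0;
}

/* One pass: returns 1 if suspended on a full frame, 0 when done. */
static int dump_pass(struct walker *w, const struct node *nodes,
		     size_t n_nodes, struct frame *f)
{
	while (w->next < n_nodes) {
		const struct node *cur = &nodes[w->next];

		if (cur != w->leaf)
			w->skip = 0;	/* leaf changed: counter is stale */
		w->leaf = cur;
		if (dump_node(cur, f, &w->skip))
			return 1;
		w->next++;
	}
	return 0;
}

/* Run passes until done, appending each frame to out. Returns the total. */
static size_t dump_all(const struct node *nodes, size_t n_nodes,
		       int *out, size_t out_len)
{
	struct walker w = { 0 };
	size_t total = 0;
	int more;

	do {
		struct frame f = { .used = 0 };

		more = dump_pass(&w, nodes, n_nodes, &f);
		for (size_t i = 0; i < f.used && total < out_len; i++)
			out[total++] = f.slots[i];
	} while (more);
	return total;
}
```

Calling dump_all() on a node with five entries followed by a node with
two should take three passes with this frame size and emit each entry
exactly once, in order. Note that the sketch only terminates because a
fresh frame always fits at least one entry, which is the same
assumption discussed above for the real dump.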

-- 
Stefano
