On Fri, May 12, 2017 at 2:27 PM, Julian Anastasov <j...@ssi.bg> wrote:
> Now the main question: is FIB_LOOKUP_NOREF used
> everywhere in IPv4? I guess so. If not, it means
> someone can walk its res->fi NHs which is bad. I think,
> this will delay the unregistration for long time and we
> can not solve the problem.
>
> If yes, free_fib_info() should not use call_rcu.
> Instead, fib_release_info() will start RCU callback to
> drop everything via a common function for fib_release_info
> and free_fib_info. As result, the last fib_info_put will
> just need to free fi->fib_metrics and fi.

Yes it is used. But this is a different problem from the dev refcnt
issue, right? I can send a separate patch to address it.

>> Are you sure we are safe to call dev_put() in fib_release_info()
>> for _all_ paths, especially non-unregister paths? See:
>
> Yep, dev_put is safe there...
>
>> commit e49cc0da7283088c5e03d475ffe2fdcb24a6d5b1
>> Author: Yanmin Zhang <yanmin_zh...@linux.intel.com>
>> Date: Wed May 23 15:39:45 2012 +0000
>>
>> ipv4: fix the rcu race between free_fib_info and ip_route_output_slow
>
> ...as long as we do not set nh_dev to NULL
>

OK, fair enough, then I think the best solution here is to move
the dev_put() from free_fib_info_rcu() to fib_release_info(),
fib_nh is already removed from hash there anyway.

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index da449dd..cb712d1 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -205,8 +205,6 @@ static void free_fib_info_rcu(struct rcu_head *head)
 	struct fib_info *fi = container_of(head, struct fib_info, rcu);
 
 	change_nexthops(fi) {
-		if (nexthop_nh->nh_dev)
-			dev_put(nexthop_nh->nh_dev);
 		lwtstate_put(nexthop_nh->nh_lwtstate);
 		free_nh_exceptions(nexthop_nh);
 		rt_fibinfo_free_cpus(nexthop_nh->nh_pcpu_rth_output);
@@ -246,6 +244,14 @@ void fib_release_info(struct fib_info *fi)
 			if (!nexthop_nh->nh_dev)
 				continue;
 			hlist_del(&nexthop_nh->nh_hash);
+			/* We have to release these nh_dev here because a dst
+			 * could still hold a fib_info via rt->fi, we can't wait
+			 * for GC, a socket could hold the dst for a long time.
+			 *
+			 * This is safe, dev_put() alone does not really free
+			 * the netdevice, we just have to put the refcnt back.
+			 */
+			dev_put(nexthop_nh->nh_dev);
 		} endfor_nexthops(fi)
 		fi->fib_dead = 1;
 		fib_info_put(fi);

Thanks!