On Wed, May 10, 2017 at 12:38 AM, Julian Anastasov <j...@ssi.bg> wrote: > > Hello, > > On Tue, 9 May 2017, Cong Wang wrote: > >> > Also setting nexthop_nh->nh_dev to NULL looks quite dangerous >> > >> > We have plenty of sites doing : >> > >> > if (fi->fib_dev) >> > x = fi->fib_dev->field >> > >> > fib_route_seq_show() is one example. >> > >> >> All of them take RCU read lock, so, as I explained in the code comment, >> they all should be fine because of synchronize_net() on unregister path. >> Do you see anything otherwise? > > During NETDEV_UNREGISTER packets for dev should not > be flying but packets for other devs can walk the nexthops > for multipath routes. It is the rcu_barrier before > NETDEV_UNREGISTER_FINAL that allows nh_dev to be set to NULL > during this grace period but there are many places to fix that > assume nh_dev!=NULL.
Excellent point. Unfortunately NETDEV_UNREGISTER_FINAL is not yet captured by the fib event notifier. I think readers are still safe to assume nh_dev!=NULL since we wait for existing readers, readers coming later can't see it any more. Or am I missing anything? > > But why we leak routes? Is there some place that holds > routes without listening for NETDEV_UNREGISTER? On fib_flush > the infos are unlinked from trees, so after a grace period packets > should not see/hold such infos. If we hold routes somewhere for > long time, problem can happen also for routes with single nexthop. My theory is that dst which holds a refcnt to fib_info (of course, after this patch) is moved in GC after NETDEV_UNREGISTER but still referenced somewhere, it therefore holds these nh_dev's, which in turn hold back to the netdevice which is being unregistered, thus Eric saw these refcnt leak warnings. I am not sure if sitting in GC for a long time is problematic or not, but from the code where we transfer dst->dev to loopback_dev, it seems this is expected otherwise no need to transfer? But I don't dig the history though. Thanks.