Hello, On Thu, 11 May 2017, Cong Wang wrote:
> On Thu, May 11, 2017 at 5:07 PM, Cong Wang <xiyou.wangc...@gmail.com> wrote: > > So, if I understand you correctly it is safe to NULL'ing > > nh_dev in NETDEV_UNREGISTER_FINAL, right? > > > > If still not, how about transfer nh_dev's to loopback_dev > > too in NETDEV_UNREGISTER? Like we transfer dst->dev. > > > > I don't want to touch the fast path to check for NULL, as > > it will change more code and slow down performance. > > Finally I come up with the attached patch. Please let me know if > I still miss anything. fib_flush will unlink the FIB infos at NETDEV_UNREGISTER time, so we can not see them in any hash tables later on NETDEV_UNREGISTER_FINAL. fib_put_nh_devs() can not work except if moving NHs in another hash table but that should not be needed. I'm thinking for the following solution which is a bit hackish: - on NETDEV_UNREGISTER we want to put the nh_dev references, so fib_release_info is a good place. But as fib_release_info is not always called, we will have two places that put references. We can use such hack: - for example, use nh_oif to know if dev_put is already called - fib_release_info() should set nh_oif to 0 because it will now call dev_put without clearing nh_dev - free_fib_info_rcu() will not call dev_put if nh_oif is 0: if (nexthop_nh->nh_dev && nexthop_nh->nh_oif) dev_put(nexthop_nh->nh_dev); - as fi is unlinked, there is no chance fib_info_hashfn() to use the cleared nh_oif for hashing - we expect noone to touch nh_dev fields after fi is unlinked, this includes free_fib_info and free_fib_info_rcu What we achieve: - between NETDEV_UNREGISTER and NETDEV_UNREGISTER_FINAL nh_dev still points to our dev (and not to loopback, I think, this answers your question in previous email), so route lookups will not find loopback in the FIB tree. We do not want some packets to be wrongly rerouted. Even if we don't hold dev reference, the RCU grace period helps here. - fib_dev/nh_dev != NULL checks are not needed, so this addresses Eric's concerns. BTW, fib_route_seq_show actually checks for fi->fib_dev, may be not in a safe way (lockless_dereference needed?) but as we don't set nh_dev to NULL this is not needed. What more? What about nh_pcpu_rth_output and nh_rth_input holding routes? We should think about them too. I should think more if nh_oif trick can work for them, may be not because nh_oif is optional... May be we should purge them somehow? Regards -- Julian Anastasov <j...@ssi.bg>