On 1/17/17 1:54 PM, David Miller wrote:
> From: David Ahern <d...@cumulusnetworks.com>
> Date: Tue, 17 Jan 2017 13:46:22 -0700
>
>> In short seems like removing the dev + the current patch dropping
>> the lock fixes the current deadlock problem and should be fine.
>
> What about the state recorded by fib_get_nhs() and similar?  There is
> a mapping from ifindex to ->nh_dev which would be invalidated if the
> RTNL semaphore is dropped.

As far as I can see, up through the call to build_state all device indices come from the user and have not been validated yet (that is, once the dev arg to build_state is removed; I sent that patch for net-next). The device index validation happens later, in fib_create_info, with the call to fib_check_nh (or dev_get_by_index for host scope).

I also sent an alternative approach that pulls the module loading into a separate function that is called while creating the fib_config. It is heavier on performance for multipath routes, but it solves the autoload without delving into the restart problem.
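
To illustrate the idea behind that alternative, here is a rough user-space sketch. It is not the patch itself and none of the names below are kernel APIs: route_req, nexthop_req, preload_encap_modules and create_route_locked are all hypothetical, and the "lock" is just a flag standing in for RTNL. The assumption is that every encap module a request needs can be found by walking its nexthops before any lock-protected state is built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum { ENCAP_NONE = 0, ENCAP_MAX = 8 };

struct nexthop_req {
    int ifindex;     /* supplied by the user, not validated at this point */
    int encap_type;  /* ENCAP_NONE or 1..ENCAP_MAX-1 */
};

struct route_req {
    const struct nexthop_req *nh;
    size_t nh_count;
};

static bool module_loaded[ENCAP_MAX];
static bool lock_held;   /* stand-in for the RTNL lock */

static void load_module(int type)
{
    /* The point of the split: a load must never happen under the lock. */
    assert(!lock_held);
    module_loaded[type] = true;
}

/* Phase 1, lock not held: walk every nexthop and load what is missing.
 * This extra walk is the added cost for multipath requests. */
static int preload_encap_modules(const struct route_req *req)
{
    for (size_t i = 0; i < req->nh_count; i++) {
        int type = req->nh[i].encap_type;

        if (type < 0 || type >= ENCAP_MAX)
            return -1;
        if (type != ENCAP_NONE && !module_loaded[type])
            load_module(type);
    }
    return 0;
}

/* Phase 2, lock held: build per-nexthop state. Returns the number of
 * nexthops set up, or -1 if a needed module is absent or a type is
 * out of range. It never drops the lock, so nothing has to restart. */
static int create_route_locked(const struct route_req *req)
{
    int built = 0;

    lock_held = true;
    for (size_t i = 0; i < req->nh_count; i++) {
        int type = req->nh[i].encap_type;

        if (type < 0 || type >= ENCAP_MAX ||
            (type != ENCAP_NONE && !module_loaded[type])) {
            built = -1;
            break;
        }
        built++;
    }
    lock_held = false;
    return built;
}
```

A caller would run preload_encap_modules() first and only then call create_route_locked(). Because the locked phase only checks for modules and never loads one, the ifindex handling that happens under the lock is never interrupted.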