On Mon, Nov 30, 2020 at 10:00:16PM +0100, Eric Dumazet wrote:
> On Mon, Nov 30, 2020 at 9:50 PM Vladimir Oltean <olte...@gmail.com> wrote:
> >
> > On Mon, Nov 30, 2020 at 09:43:01PM +0100, Eric Dumazet wrote:
> > > Understood, but really dev_base_lock can only be removed _after_ we
> > > convert all usages to something else (mutex based, and preferably not
> > > the global RTNL)
> >
> > Sure.
> > A large part of getting rid of dev_base_lock seems to be just:
> > - deleting the bogus usage from mlx4 infiniband and friends
> > - converting procfs, sysfs and friends to netdev_lists_mutex
> > - renaming whatever is left into something related to the RFC 2863
> >   operstate.
> >
> > > Focusing on dev_base_lock seems a distraction really.
> >
> > Maybe.
> > But it's going to be awkward to explain in words what the locking rules
> > are, when the read side can take optionally the dev_base_lock, RCU, or
> > netdev_lists_lock, and the write side can take optionally the dev_base_lock,
> > RTNL, or netdev_lists_lock. Not to mention that anybody grepping for
> > dev_base_lock will see the current usage and not make a lot out of it.
> >
> > I'm not really sure how to order this rework to be honest.
>
> We can not have a mix of RCU /rwlock/mutex. It must be one, because of
> bonding/teaming.
>
> So all existing uses of rwlock / RCU need to be removed.
>
> This is probably not trivial.

Now, "it's going to look nasty" is one thing, whereas "it won't work" is
completely different. I think it would work though, so could you expand
on why you're saying we can't have the mix?

dev_change_name(), list_netdevice() and unlist_netdevice() just need to
take one more layer of locking. The new netdev_lists_mutex would serve
as a temporary alternative to the RTNL mutex. Then we could gradually
replace more and more of the RTNL mutex with netdev_lists_mutex.
The bonding driver can certainly use the netdev_lists_mutex. It
guarantees protection against the three functions mentioned above, and
it is sleepable, and it is not the RTNL mutex. So can procfs and sysfs.
Am I missing something?

> Perhaps you could add a temporary ndo_get_sleepable_stats64() so that
> drivers can be converted one at a time.

Yeah, been there, Jakub doesn't like it.
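
To make the "one more layer of locking" idea above concrete, here is a
rough userspace sketch. It is an analogy only, not kernel code and not
taken from any patch: pthread mutexes stand in for the RTNL mutex and the
proposed netdev_lists_mutex, and every fake_* name is hypothetical. The
assumption is that the three writers hold both locks (outer first, inner
second), while sleepable readers such as bonding, procfs or sysfs would
take only the inner one.

```c
/* Userspace analogy only: pthread mutexes stand in for kernel locks.
 * All fake_* identifiers are hypothetical and exist only for this sketch. */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEVS 8
#define NAME_LEN 16

struct fake_netdev {
	char name[NAME_LEN];
	int listed;
};

/* Outer lock (stand-in for RTNL) and inner lock (stand-in for the
 * proposed netdev_lists_mutex). Lock order is always outer, then inner. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t fake_netdev_lists_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct fake_netdev devs[MAX_DEVS];

/* Writer: adds a device while holding both locks. Returns slot or -1. */
static int fake_list_netdevice(const char *name)
{
	int i, ret = -1;

	pthread_mutex_lock(&fake_rtnl);
	pthread_mutex_lock(&fake_netdev_lists_mutex);
	for (i = 0; i < MAX_DEVS; i++) {
		if (!devs[i].listed) {
			snprintf(devs[i].name, NAME_LEN, "%s", name);
			devs[i].listed = 1;
			ret = i;
			break;
		}
	}
	pthread_mutex_unlock(&fake_netdev_lists_mutex);
	pthread_mutex_unlock(&fake_rtnl);
	return ret;
}

/* Writer: renames a listed device while holding both locks. */
static int fake_change_name(int idx, const char *newname)
{
	int ret = -1;

	assert(idx >= 0 && idx < MAX_DEVS);
	pthread_mutex_lock(&fake_rtnl);
	pthread_mutex_lock(&fake_netdev_lists_mutex);
	if (devs[idx].listed) {
		snprintf(devs[idx].name, NAME_LEN, "%s", newname);
		ret = 0;
	}
	pthread_mutex_unlock(&fake_netdev_lists_mutex);
	pthread_mutex_unlock(&fake_rtnl);
	return ret;
}

/* Writer: removes a device while holding both locks. */
static void fake_unlist_netdevice(int idx)
{
	assert(idx >= 0 && idx < MAX_DEVS);
	pthread_mutex_lock(&fake_rtnl);
	pthread_mutex_lock(&fake_netdev_lists_mutex);
	devs[idx].listed = 0;
	devs[idx].name[0] = '\0';
	pthread_mutex_unlock(&fake_netdev_lists_mutex);
	pthread_mutex_unlock(&fake_rtnl);
}

/* Reader: takes only the inner mutex, never the outer one. It may sleep
 * while holding it and still cannot race with the three writers. */
static int fake_count_listed(void)
{
	int i, n = 0;

	pthread_mutex_lock(&fake_netdev_lists_mutex);
	for (i = 0; i < MAX_DEVS; i++)
		n += devs[i].listed;
	pthread_mutex_unlock(&fake_netdev_lists_mutex);
	return n;
}

/* Reader: looks up a device by name under the inner mutex only. */
static int fake_find_by_name(const char *name)
{
	int i, ret = -1;

	pthread_mutex_lock(&fake_netdev_lists_mutex);
	for (i = 0; i < MAX_DEVS; i++) {
		if (devs[i].listed && strcmp(devs[i].name, name) == 0) {
			ret = i;
			break;
		}
	}
	pthread_mutex_unlock(&fake_netdev_lists_mutex);
	return ret;
}
```

In the kernel the writers are normally entered with RTNL already held, so
only the inner lock would be new there; the sketch takes both explicitly
to make the ordering visible. It says nothing about the existing rwlock
and RCU readers, which is the part being questioned.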