Quoting Maxim Mikityanskiy (2021-03-15 15:53:02)
> On 2021-03-15 10:38, Antoine Tenart wrote:
> > Quoting Saeed Mahameed (2021-03-12 21:54:18)
> >> There is a reason why it is conditional:
> >> we had a bug in the past of double locking here:
> >>
> >> [ 4255.283960] echo/644 is trying to acquire lock:
> >>
> >> [ 4255.285092] ffffffff85101f90 (rtnl_mutex){+..}, at:
> >> mlx5e_attach_netdev+0xd4/0x3d0 [mlx5_core]
> >>
> >> [ 4255.287264]
> >>
> >> [ 4255.287264] but task is already holding lock:
> >>
> >> [ 4255.288971] ffffffff85101f90 (rtnl_mutex){+..}, at:
> >> ipoib_vlan_add+0x7c/0x2d0 [ib_ipoib]
> >>
> >> ipoib_vlan_add is called under rtnl and will eventually call
> >> mlx5e_attach_netdev, we don't have much control over this in mlx5
> >> driver since the rdma stack provides a per-prepared netdev to attach to
> >> our hw. maybe it is time we had a nested rtnl lock ..
> >
> > Thanks for the explanation. So as you said, we can't based the locking
> > decision only on the driver own state / information...
> >
> > What about `take_rtnl = !rtnl_is_locked();`?
>
> It won't work, because the lock may be taken by some other unrelated
> thread. By doing `if (!rtnl_is_locked()) rtnl_lock()` we defeat the
> purpose of the lock, because we will proceed to the critical section
> even if we should wait until some other thread releases the lock.

Ah, that's right...
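
For the record, here is a minimal userspace sketch of the problem Maxim
describes. It is not kernel code: `fake_rtnl`, `fake_rtnl_lock()`,
`fake_rtnl_unlock()`, `fake_rtnl_is_locked()` and
`demo_conditional_lock_race()` are hypothetical pthread-based stand-ins that
I made up for illustration. The only assumption is that an "is locked" query
reports that somebody holds the lock, not that the caller does.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the rtnl helpers (illustration only). */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool fake_rtnl_held;

static void fake_rtnl_lock(void)
{
	pthread_mutex_lock(&fake_rtnl);
	atomic_store(&fake_rtnl_held, true);
}

static void fake_rtnl_unlock(void)
{
	atomic_store(&fake_rtnl_held, false);
	pthread_mutex_unlock(&fake_rtnl);
}

/* True if *any* thread holds the lock, not necessarily the caller. */
static bool fake_rtnl_is_locked(void)
{
	return atomic_load(&fake_rtnl_held);
}

/* Set by the worker if it ran the critical section without owning the lock. */
static bool entered_without_lock;

/* An unrelated thread using the proposed conditional locking pattern. */
static void *unrelated_caller(void *arg)
{
	bool take_rtnl = !fake_rtnl_is_locked();

	(void)arg;

	if (take_rtnl)
		fake_rtnl_lock();

	/* Critical section: we get here even when another thread owns the lock. */
	entered_without_lock = !take_rtnl;

	if (take_rtnl)
		fake_rtnl_unlock();

	return NULL;
}

/* Returns true if the worker entered the critical section unprotected. */
static bool demo_conditional_lock_race(void)
{
	pthread_t worker;

	entered_without_lock = false;

	fake_rtnl_lock();	/* this thread is the real lock owner */
	if (pthread_create(&worker, NULL, unrelated_caller, NULL) != 0) {
		fake_rtnl_unlock();
		return false;
	}
	pthread_join(worker, NULL);	/* worker finishes while we still hold it */
	fake_rtnl_unlock();

	return entered_without_lock;
}
```

Calling `demo_conditional_lock_race()` from a `main()` built with `-pthread`
should return true: the worker sees the lock as taken, skips the lock call
and runs the critical section concurrently with the real owner instead of
waiting for it.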