On Wed, 2019-09-04 at 09:38 -0700, Jonathan Lemon wrote:
> On 4 Sep 2019, at 0:39, Eric Dumazet wrote:
>
> > On 9/3/19 11:55 PM, Jonathan Lemon wrote:
> > > How appropriate is it to hold the rtnl_lock() across a sleepable
> > > memory allocation? On one hand it's just a mutex, but it would
> > > seem like it could block quite a few things.
> > >
> >
> > Sure, all GFP_KERNEL allocations can sleep for quite a while.
> >
> > On the other hand, we may want to delay stuff if memory is under
> > pressure, or complex operations like NEWLINK would fail.
> >
> > RTNL is mostly taken for control path operations, we prefer them
> > to be mostly reliable, otherwise admins job would be a nightmare.
> >
> > In some cases, it is relatively easy to pre-allocate memory before
> > rtnl is taken, but that will only take care of some selected paths.
>
> The particular code path that I'm looking at is
> mlx5e_tx_timeout_work().
>
> This is called on TX timeout, and mlx5 wants to move an entire
> channel and all the supporting structures elsewhere. Under the
> rtnl_lock(), it calls kvzmalloc() in order to grab a large chunk of
> contig memory, which ends up stalling the system.
>
> I suspect these large allocation should really be done outside the
> lock.

I am afraid that is not possible, at least not for all allocations.
Some allocations depend on parameters that must remain valid and
constant across the whole reconfiguration procedure, such as
params.num_channels, so they must be done inside the lock. Other
allocations are buried deep inside mlx5, and pre-allocating them would
require a lot of refactoring.

One idea is to use some sort of memory cache specifically for mlx5
reconfiguration that is cheaper to call than raw kvzalloc(). But
different objects in the mlx5 reconfiguration path require different
memory types, NUMA affinity, etc., which might make it hard for a
single cache to satisfy all requirements.
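
For the subset of allocations that depend only on a parameter such as
params.num_channels, one possible pattern is to snapshot the parameter,
allocate with the lock dropped, then re-take the lock and retry if the
parameter changed in the meantime. Below is a rough userspace sketch of
that idea only, not mlx5 code: cfg_lock stands in for rtnl_lock(),
calloc() for kvzalloc(), and struct channel, active, active_count and
reconfigure() are all hypothetical names made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins: cfg_lock plays the role of rtnl_lock(),
 * num_channels the role of params.num_channels. */
static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int num_channels = 8;

struct channel {
	char ring[4096];
};

static struct channel *active;		/* current channel array */
static unsigned int active_count;	/* number of entries in active */

/* Returns 0 on success, -1 on allocation failure. */
static int reconfigure(void)
{
	struct channel *fresh, *old;
	unsigned int want;

	for (;;) {
		/* Snapshot the parameter under the lock, then drop it. */
		pthread_mutex_lock(&cfg_lock);
		want = num_channels;
		pthread_mutex_unlock(&cfg_lock);
		assert(want > 0);	/* sketch assumes at least one channel */

		/* The slow, possibly blocking allocation runs unlocked. */
		fresh = calloc(want, sizeof(*fresh));
		if (!fresh)
			return -1;

		pthread_mutex_lock(&cfg_lock);
		if (want == num_channels)
			break;		/* snapshot still valid, keep the lock */

		/* Parameter changed while unlocked: discard and retry. */
		pthread_mutex_unlock(&cfg_lock);
		free(fresh);
	}

	/* Only the cheap pointer swap happens with the lock held. */
	old = active;
	active = fresh;
	active_count = want;
	pthread_mutex_unlock(&cfg_lock);

	free(old);
	return 0;
}
```

This does nothing for the allocations buried deeper in the driver, and
the retry loop can waste work if the parameter changes often, so it
only covers the easy cases mentioned above.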