On 4 Sep 2019, at 0:39, Eric Dumazet wrote:
On 9/3/19 11:55 PM, Jonathan Lemon wrote:
How appropriate is it to hold the rtnl_lock() across a sleepable
memory allocation? On one hand it's just a mutex, but it would
seem like it could block quite a few things.
Sure, all GFP_KERNEL allocations can sleep for quite a while.
On the other hand, we may want to delay stuff if memory is under
pressure, or complex operations like NEWLINK would fail.
RTNL is mostly taken for control path operations, we prefer them to be
mostly reliable, otherwise an admin's job would be a nightmare.
In some cases, it is relatively easy to pre-allocate memory before
rtnl is taken, but that will only take care of some selected paths.
The particular code path that I'm looking at is mlx5e_tx_timeout_work().
This is called on TX timeout, and mlx5 wants to move an entire channel
and all the supporting structures elsewhere. Under the rtnl_lock(), it
calls kvzalloc() in order to grab a large chunk of contig memory, which
ends up stalling the system.
I suspect these large allocations should really be done outside the lock.
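
To illustrate the "allocate first, then lock" ordering, here is a minimal
user-space sketch. It is not mlx5 code: cfg_lock stands in for the RTNL
mutex, calloc() stands in for the sleepable allocation, and struct channel,
channel_alloc(), channel_free() and channel_replace() are hypothetical
names made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins: cfg_lock plays the role of the RTNL mutex,
 * struct channel the role of the per-channel state being rebuilt. */
struct channel {
    size_t nbytes;
    unsigned char *buf;
};

static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;
static struct channel *active;   /* protected by cfg_lock */

/* Slow part: may block for a long time, so it runs with no lock held. */
static struct channel *channel_alloc(size_t nbytes)
{
    struct channel *ch;

    assert(nbytes > 0);
    ch = calloc(1, sizeof(*ch));
    if (!ch)
        return NULL;
    ch->buf = calloc(1, nbytes);
    if (!ch->buf) {
        free(ch);
        return NULL;
    }
    ch->nbytes = nbytes;
    return ch;
}

static void channel_free(struct channel *ch)
{
    if (ch) {
        free(ch->buf);
        free(ch);
    }
}

/* Returns 0 on success, -1 if the allocation failed (old channel kept). */
static int channel_replace(size_t nbytes)
{
    struct channel *fresh = channel_alloc(nbytes);  /* outside the lock */
    struct channel *old;

    if (!fresh)
        return -1;

    pthread_mutex_lock(&cfg_lock);
    old = active;          /* only a pointer swap happens under the lock */
    active = fresh;
    pthread_mutex_unlock(&cfg_lock);

    channel_free(old);     /* freeing is also done after unlocking */
    return 0;
}
```

In a real driver the state could change between the allocation and taking
the lock, so the parameters used for the allocation would need to be
rechecked under the lock; the sketch ignores that.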
--
Jonathan