On Thu, 13 Aug 2020 15:51:46 +0300 Shay Agroskin wrote: > Long answer: > The ena_destroy_device() function is called with rtnl_lock() held, > so it cannot run in parallel with the reset function. Also the > destroy function clears the bit ENA_FLAG_TRIGGER_RESET without > which the reset function just exits without doing anything. > > A problem can then only happen when some routine sets the > ENA_FLAG_TRIGGER_RESET bit before the reset function is executed, > the following describes all functions from which this bit can be > set:
ena_fw_reset_device() runs from a workqueue, it can be preempted right before it tries to take the rtnl_lock. Then after arbitrarily long delay it will start again, take the lock, and dereference adapter->flags. But adapter could have been long freed at this point. Unless you flush a workqueue or cancel_work_sync() you can never be sure it's not scheduled. And I can only see a flush when module is unloaded now.