On Sun, 6 Sep 2020 15:07:02 -0700 Michael Chan wrote: > On Sun, Sep 6, 2020 at 12:25 PM Jakub Kicinski <k...@kernel.org> wrote: > > > > devlink can itself scheduler a recovery via: > > > > bnxt_fw_fatal_recover() -> bnxt_fw_reset() > > > > Yes, this is how it is initiated when we call devlink_health_report() > to report the error condition. From bnxt_fw_reset(), we use a > workqueue because we have to go through many states, requiring > sleeping/polling to transition through the states. > > > no? Maybe don't make the devlink recovery path need to go via a > > workqueue? > > Current implementation is going through a work queue.
What I'm saying is the code looks like this after this patch: + clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state); + bnxt_cancel_sp_work(bp); + bp->sp_event = 0; + bnxt_dl_fw_reporters_destroy(bp, true); It cancels the work, _then_ destroys the reporter. But I think the reported can be used to schedule a recovery from command line. So the work may get re-scheduled after it has been canceled. devlink_nl_cmd_health_reporter_recover_doit() -> bnxt_fw_fatal_recover() -> bnxt_fw_reset() -> bnxt_queue_fw_reset_work() What am I missing?