On Sun, 6 Sep 2020 15:07:02 -0700 Michael Chan wrote:
> On Sun, Sep 6, 2020 at 12:25 PM Jakub Kicinski <[email protected]> wrote:
> >
> > devlink can itself scheduler a recovery via:
> >
> > bnxt_fw_fatal_recover() -> bnxt_fw_reset()
> >
>
> Yes, this is how it is initiated when we call devlink_health_report()
> to report the error condition. From bnxt_fw_reset(), we use a
> workqueue because we have to go through many states, requiring
> sleeping/polling to transition through the states.
>
> > no? Maybe don't make the devlink recovery path need to go via a
> > workqueue?
>
> Current implementation is going through a work queue.
What I'm saying is the code looks like this after this patch:
+ clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
+ bnxt_cancel_sp_work(bp);
+ bp->sp_event = 0;
+
bnxt_dl_fw_reporters_destroy(bp, true);
It cancels the work, _then_ destroys the reporter. But I think the
reported can be used to schedule a recovery from command line. So the
work may get re-scheduled after it has been canceled.
devlink_nl_cmd_health_reporter_recover_doit() -> bnxt_fw_fatal_recover() ->
bnxt_fw_reset() -> bnxt_queue_fw_reset_work()
What am I missing?