From: Michael Chan <[email protected]>
Date: Sun, 25 Aug 2019 23:54:59 -0400
> @@ -9234,6 +9243,13 @@ int bnxt_close_nic(struct bnxt *bp, bool irq_re_init,
> bool link_re_init)
> {
> int rc = 0;
>
> + while (test_bit(BNXT_STATE_IN_FW_RESET, &bp->state)) {
> + netdev_info(bp->dev, "FW reset in progress, delaying close");
> + rtnl_unlock();
> + msleep(250);
> + rtnl_lock();
> + }
Dropping the RTNL here is extremely dangerous.
Operations other than actual device close can get into the
bnxt_close_nic() code paths (changing features, ethtool ops, etc.)
So we can re-enter this function once you drop the RTNL mutex.
Furthermore, and I understand the pains you go to in patch #9 to
avoid this, it's an endless loop. If there are bugs there, we
will get stuck in this close path forever.
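
To illustrate only the endless-loop concern, here is a minimal userspace
sketch of a bounded wait that gives up with an error instead of spinning
forever. It is not taken from the patch series and is not a proposed fix.
struct fake_dev, fw_reset_in_progress() and wait_for_fw_reset() are
hypothetical names standing in for the driver state and the
test_bit(BNXT_STATE_IN_FW_RESET, &bp->state) check. It does nothing about
the RTNL re-entrancy problem, which needs a different approach.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real struct bnxt. */
struct fake_dev {
	int polls_until_reset_done;	/* reset "completes" after this many polls */
};

/* Hypothetical stand-in for test_bit(BNXT_STATE_IN_FW_RESET, &bp->state). */
static bool fw_reset_in_progress(struct fake_dev *dev)
{
	if (dev->polls_until_reset_done > 0) {
		dev->polls_until_reset_done--;
		return true;
	}
	return false;
}

/*
 * Check the reset flag at most max_polls times. Return 0 as soon as the
 * reset is seen to be finished, or -ETIMEDOUT if it is still in progress
 * after the last check, so a bug elsewhere cannot wedge the caller.
 */
static int wait_for_fw_reset(struct fake_dev *dev, int max_polls)
{
	int i;

	for (i = 0; i < max_polls; i++) {
		if (!fw_reset_in_progress(dev))
			return 0;
		/* A driver would sleep between polls here, e.g. msleep(250). */
	}
	return -ETIMEDOUT;
}
```

With a bound like this the caller has to handle the timeout error, but a
stuck reset becomes a visible failure rather than a hung close path.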