On Oct 19, 2020, at 2:52 PM, Sukadev Bhattiprolu <suka...@linux.ibm.com> wrote:
From 67f8977f636e462a1cd1eadb28edd98ef4f2b756 Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu <suka...@linux.vnet.ibm.com>
Date: Thu, 10 Sep 2020 11:18:41 -0700
Subject: [PATCH 1/1] powerpc/vnic: Extend "failover pending" window
Commit 5a18e1e0c193b introduced the 'failover_pending' state to track
the "failover pending window" - where we wait for the partner to become
ready (after a transport event) before actually attempting to failover.
i.e. the window is between the following two events:

        a. we get a transport event due to a FAILOVER

        b. later, we get CRQ_INITIALIZED indicating the partner is
           ready at which point we schedule a FAILOVER reset.

and ->failover_pending is true during this window.

If during this window, we attempt to open (or close) a device, we pretend
that the operation succeeded and let the FAILOVER reset path complete the
operation.

This is fine, except if the transport event ("a" above) occurs during the
open and after open has already checked whether a failover is pending. If
that happens, we fail the open, which can cause the boot scripts to leave
the interface down, requiring the administrator to manually bring up the
device.

This fix "extends" the failover pending window till we are _actually_
ready to perform the failover reset (i.e. until after we get the RTNL
lock). Since open() holds the RTNL lock, we can be sure that we either
finish the open or if the open() fails due to the failover pending window,
we can again pretend that open is done and let the failover complete it.
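
For readers following the thread, the open-side behaviour described above can be modelled outside the kernel. The following is only a user-space sketch under stated assumptions: the `sketch_*` names and types are hypothetical stand-ins (not driver identifiers), there is no real locking, and marking the device open on a successful attempt is an assumption of the model rather than something shown in this patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's adapter states. */
enum sketch_state { SKETCH_CLOSED, SKETCH_OPEN };

struct sketch_adapter {
	enum sketch_state state;
	bool failover_pending;	/* true while a failover is pending */
};

/*
 * Post-process the result of an open attempt. 'rc' is the error code
 * the open path would otherwise return (0 on success).
 */
int sketch_finish_open(struct sketch_adapter *a, int rc)
{
	if (rc && a->failover_pending) {
		/* Pretend success; the failover reset completes the open. */
		a->state = SKETCH_OPEN;
		return 0;
	}
	if (!rc)
		a->state = SKETCH_OPEN;	/* assumption: success marks it open */
	return rc;
}

/*
 * Reset side: models clearing the flag only once the reset path is
 * actually ready to run (in the patch, after rtnl_lock() is taken).
 */
void sketch_begin_failover_reset(struct sketch_adapter *a)
{
	a->failover_pending = false;
}
```

In this model, a failed open is reported as success only while `failover_pending` is still set; once the reset side has cleared the flag, the same failure is returned to the caller unchanged.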
Signed-off-by: Sukadev Bhattiprolu <suka...@linux.ibm.com>
---
Changelog [v2]:
[Brian King] Ensure we clear failover_pending during hard reset
---
 drivers/net/ethernet/ibm/ibmvnic.c | 36 ++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 1b702a43a5d0..2a0f6f6820db 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1197,18 +1197,27 @@ static int ibmvnic_open(struct net_device *netdev)
 	if (adapter->state != VNIC_CLOSED) {
 		rc = ibmvnic_login(netdev);
 		if (rc)
-			return rc;
+			goto out;
 		rc = init_resources(adapter);
 		if (rc) {
 			netdev_err(netdev, "failed to initialize resources\n");
 			release_resources(adapter);
-			return rc;
+			goto out;
 		}
 	}
 	rc = __ibmvnic_open(netdev);
+out:
+	/*
+	 * If open fails due to a pending failover, set device state and
+	 * return. Device operation will be handled by reset routine.
+	 */
+	if (rc && adapter->failover_pending) {
+		adapter->state = VNIC_OPEN;
+		rc = 0;
+	}
 	return rc;
 }
@@ -1931,6 +1940,13 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 		   rwi->reset_reason);
 	rtnl_lock();
+	/*
+	 * Now that we have the rtnl lock, clear any pending failover.
+	 * This will ensure ibmvnic_open() has either completed or will
+	 * block until failover is complete.
+	 */
+	if (rwi->reset_reason == VNIC_RESET_FAILOVER)
+		adapter->failover_pending = false;
 	netif_carrier_off(netdev);
 	adapter->reset_reason = rwi->reset_reason;
@@ -2211,6 +2227,13 @@ static void __ibmvnic_reset(struct work_struct *work)
 			/* CHANGE_PARAM requestor holds rtnl_lock */
 			rc = do_change_param_reset(adapter, rwi, reset_state);
 		} else if (adapter->force_reset_recovery) {
+			/*
+			 * Since we are doing a hard reset now, clear the
+			 * failover_pending flag so we don't ignore any
+			 * future MOBILITY or other resets.
+			 */
+			adapter->failover_pending = false;
+
I think it would be better to move the above chunk of code into
do_hard_reset(), as you do for do_reset(), if you really want to extend
the window