On 2017-11-02 9:11 PM, Jay Vosburgh wrote:
Alex Sidorenko <alexandre.sidore...@hpe.com> wrote:
...> I think I see the flaw in the logic.
1) bond_miimon_inspect finds link_state = 0, then makes a call
to bond_propose_link_state(BOND_LINK_FAIL), setting link_new_state to
BOND_LINK_FAIL. _inspect then sets slave->new_link = BOND_LINK_DOWN and
returns non-zero.
2) bond_mii_monitor rtnl_trylock fails, it reschedules.
3) bond_mii_monitor runs again, and calls bond_miimon_inspect.
4) the slave's link has recovered, so link_state != 0.
slave->link is still BOND_LINK_UP. The slave's link_new_state remains
set to BOND_LINK_FAIL, but new_link is reset to NOCHANGE.
bond_miimon_inspect returns 0, so nothing is committed.
5) step 4 can repeat indefinitely.
6) eventually, the other slave does something that causes
commit++, making bond_mii_monitor call bond_commit_link_state and then
bond_miimon_commit. The slave in question from steps 1-4 still has
link_new_state as BOND_LINK_FAIL, but new_link is NOCHANGE, so it ends
up in BOND_LINK_FAIL state.
I think step 6 could also occur concurrently with the initial
pass through step 4 to induce the problem.
It looks like Mahesh mostly fixed this in
commit fb9eb899a6dc663e4a2deed9af2ac28f507d0ffb
Author: Mahesh Bandewar <mahe...@google.com>
Date: Tue Apr 11 22:36:00 2017 -0700
bonding: handle link transition from FAIL to UP correctly
but the window still exists, and requires the slave link state
to change between the failed rtnl_trylock and the second pass through
_inspect. The problem is that a state transition has been kept between
invocations to _inspect, but the condition that induced the transition
has changed.
I haven't tested these, but I suspect the solution is either to
clear link_new_state on entry to the loop in bond_miimon_inspect, or
merge new_state and link_new_state as a single "next state" (which is
cleared on entry to the loop).
The first of these is a pretty simple patch:
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 18b58e1376f1..6f89f9981a6c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2046,6 +2046,7 @@ static int bond_miimon_inspect(struct bonding *bond)
bond_for_each_slave_rcu(bond, slave, iter) {
slave->new_link = BOND_LINK_NOCHANGE;
+ slave->link_new_state = slave->link;
link_state = bond_check_dev_link(bond, slave->dev, 0);
Alex / Jarod, could you check my logic, and would you be able to
test this patch if my analysis appears sound?
This patch looks good, the original reproducing setup successfully
recovers after the original active slave goes down, even with
NetworkManager in the mix.
Reviewed-by: Jarod Wilson <ja...@redhat.com>
--
Jarod Wilson
ja...@redhat.com