Hangbin Liu <[email protected]> wrote:

>When disabling a port’s collecting and distributing states, updating only
>rx_disabled is not sufficient. We also need to set AD_RX_PORT_DISABLED
>so that the rx_machine transitions into the AD_RX_EXPIRED state.
>
>One example is in ad_agg_selection_logic(): when a new aggregator is
>selected and old active aggregator is disabled, if AD_RX_PORT_DISABLED is
>not set, the disabled port may remain stuck in AD_RX_CURRENT due to
>continuing to receive partner LACP messages.

        I'm not sure I'm seeing the problem here.  Is there an actual
misbehavior being fixed?  The port is receiving LACPDUs, and from the
receive state machine's point of view (Figure 6-18) there's no issue.
The "port_enabled" variable (6.4.7) also informs the state machine's
behavior, but that's not the same as what's changed by bonding's
__disable_port() function.
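
        To make the state machine point concrete, here is a rough
sketch of the receive machine transitions as I read Figure 6-18.  This
is a toy model, not the bonding code: the enum, struct, and function
names are made up for illustration, and port_moved and the per-state
actions are left out.  It only shows that a port in CURRENT that keeps
receiving LACPDUs stays in CURRENT, and that PORT_DISABLED is entered
when port_enabled is false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for this sketch; not the bonding driver's types. */
enum rx_state {
	RX_INITIALIZE,
	RX_PORT_DISABLED,
	RX_LACP_DISABLED,
	RX_EXPIRED,
	RX_DEFAULTED,
	RX_CURRENT,
};

/* Inputs sampled for one pass of the machine (simplified). */
struct rx_event {
	bool port_enabled;          /* the 6.4.7 port_enabled variable */
	bool lacp_enabled;
	bool lacpdu_received;
	bool current_while_expired; /* current_while timer ran out */
};

/* Return the next receive state; per-state actions are omitted. */
static enum rx_state rx_next(enum rx_state s, const struct rx_event *ev)
{
	assert(ev);

	/* Global transition: a port that is not enabled is PORT_DISABLED
	 * (port_moved is ignored in this sketch).
	 */
	if (!ev->port_enabled)
		return RX_PORT_DISABLED;

	switch (s) {
	case RX_INITIALIZE:
		return RX_PORT_DISABLED;
	case RX_PORT_DISABLED:
		return ev->lacp_enabled ? RX_EXPIRED : RX_LACP_DISABLED;
	case RX_LACP_DISABLED:
		return RX_LACP_DISABLED;
	case RX_EXPIRED:
		if (ev->lacpdu_received)
			return RX_CURRENT;
		if (ev->current_while_expired)
			return RX_DEFAULTED;
		return RX_EXPIRED;
	case RX_DEFAULTED:
		return ev->lacpdu_received ? RX_CURRENT : RX_DEFAULTED;
	case RX_CURRENT:
		if (ev->lacpdu_received)
			return RX_CURRENT;
		if (ev->current_while_expired)
			return RX_EXPIRED;
		return RX_CURRENT;
	}
	return s;
}
```

        In this model, nothing short of port_enabled going false or
the current_while timer expiring moves a port out of CURRENT, which is
why I don't see a receive machine problem while LACPDUs keep arriving.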

        Where I'm going with this is that, when multiple aggregator
support was originally implemented, the theory was to keep aggregators
other than the active agg in a state such that they could be put into
service immediately, without having to do LACPDU exchanges in order to
transition into the appropriate state.  A hot standby, basically,
analogous to an active-backup mode backup interface with link state up.

        I haven't tested this in some time, though, so my question is
whether this change affects the failover time when an active aggregator
is de-selected in favor of another aggregator.  By "failover time," I
mean how long transmission and/or reception are interrupted when
changing from one aggregator to another.  I presume that if aggregator
failover after this change requires LACPDU exchanges, etc., it will take
longer to fail over.
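
        For what it's worth, one simple way to put a number on that
would be to run a steady packet stream across the bond, record arrival
times at the receiver, and take the longest gap between consecutive
packets.  A minimal sketch, assuming the timestamps are already
collected in ascending order (the helper name and the nanosecond unit
are my own choices here, not anything from the bonding code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: ts[] holds packet arrival times in nanoseconds,
 * sorted ascending.  Returns the longest interval between two
 * consecutive arrivals, or 0 if there are fewer than two samples.
 */
static uint64_t longest_gap_ns(const uint64_t *ts, size_t n)
{
	uint64_t worst = 0;

	for (size_t i = 1; i < n; i++) {
		uint64_t gap;

		/* The subtraction below relies on ascending order. */
		assert(ts[i] >= ts[i - 1]);
		gap = ts[i] - ts[i - 1];
		if (gap > worst)
			worst = gap;
	}
	return worst;
}
```

        Comparing that figure with and without the patch, for the same
traffic rate, would show whether the failover window grows.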

        -J


>The __disable_port() called by ad_disable_collecting_distributing()
>does not have this issue, since its caller also clears the
>collecting/distributing bits.
>
>The __disable_port() called by bond_3ad_bind_slave() should also be fine,
>as the RX state machine is re-initialized to AD_RX_INITIALIZE.
>
>Let's fix this only in ad_agg_selection_logic() to reduce the chances of
>unintended side effects.
>
>Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>Signed-off-by: Hangbin Liu <[email protected]>
>---
> drivers/net/bonding/bond_3ad.c | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index af7f74cfdc08..c47f6a69fd2a 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -1932,6 +1932,7 @@ static void ad_agg_selection_logic(struct aggregator *agg,
>               if (active) {
>                       for (port = active->lag_ports; port;
>                            port = port->next_port_in_aggregator) {
>+                              port->sm_rx_state = AD_RX_PORT_DISABLED;
>                               __disable_port(port);
>                       }
>               }
>-- 
>2.50.1
>

---
        -Jay Vosburgh, [email protected]

