Wed, May 25, 2016 at 09:21:52PM CEST, ivec...@redhat.com wrote:
>The team_device_event() notifier calls team_compute_features() to fix
>vlan_features under team->lock to protect team->port_list. The problem is
>that __team_compute_features() then calls netdev_change_features() to
>propagate vlan_features to upper vlan devices while team->lock is still
>held. This can lead to a deadlock when NETIF_F_LRO is modified on the
>lower devices or on the team device itself.
>
>Example:
>team0 runs in active-backup mode with the NICs eth0 and eth1. Both eth0
>and eth1 are LRO capable and have LRO enabled, so LRO is also enabled on
>team0.
>
>The command 'ethtool -K team0 lro off' now hangs due to this deadlock:
>
>dev_ethtool()
>-> ethtool_set_features()
> -> __netdev_update_features(team)
>  -> netdev_sync_lower_features()
>   -> netdev_update_features(lower_1)
>    -> __netdev_update_features(lower_1)
>    -> netdev_features_change(lower_1)
>     -> call_netdevice_notifiers(...)
>      -> team_device_event(lower_1)
>       -> team_compute_features(team) [TAKES team->lock]
>        -> netdev_change_features(team)
>         -> __netdev_update_features(team)
>          -> netdev_sync_lower_features()
>           -> netdev_update_features(lower_2)
>            -> __netdev_update_features(lower_2)
>            -> netdev_features_change(lower_2)
>             -> call_netdevice_notifiers(...)
>              -> team_device_event(lower_2)
>               -> team_compute_features(team) [DEADLOCK]
>
>The bug has been present in team from the beginning, but it only surfaced
>after commit fd867d5 (net/core: generic support for disabling netdev
>features down stack), which added synchronization of features with lower
>devices.
>
>Fixes: fd867d5 (net/core: generic support for disabling netdev features down stack)
>Cc: Jiri Pirko <j...@resnulli.us>
>Signed-off-by: Ivan Vecera <ivec...@redhat.com>

Signed-off-by: Jiri Pirko <j...@mellanox.com>
