> We dropped rtnl_lock to in fcoe_update_src_mac() to avoid the deadlock of rtnl
> mutex with fip controller lock in commit 949e71f. However, calling
> dev_uc_add/del() must hold rtnl_lock otherwise it could hit rtnl lock 
> assertion
> failure. Since we have to follow the order of grabbing fip lock after rtnl 
> lock
> since netdev event callback comes in w/ rtnl lock held, when update_mac() is
> called in libfcoe, we should drop fip lock first otherwise it will again get
> back to the deadlock situation. This issue is observed in VN2VN setup.
> 
> I don't like unlock/lock there, but we need the newly mac to be added before
> sending the claim otherwise there is a race that the response frame may come
> back before the mac is added to the underlying device's filter table so the
> frame may get dropped.
> 
> Signed-off-by: Yi Zou <[email protected]>
This RFC should be dropped and I need to think about more as it does not solve
all the issues. What facts I know are:

1. rtnl_lock() must be grabbed when dev_uc/mc_del/add() are called. It may not
hit rtnl assertion failure depending on if the underlying netdev has unicast 
filering
capability or not, if not, __deve_set_promiscuity() will be called that would 
fail
rtnl assertion. 

2. when rtnl mutex and fip->ctrl_mutex are nested w/, it has to be rtnl first, 
then
fip mutex second because we have netdev notficatier callback comes in w/ rtnl
already held.

3. one place that update_mac() is called in fip timer work context, which was 
what
RFC is attending to, it is also called in fcoe_ctlr_els_send() for non FIP 
mode, where
It may have a spinlock held already, i.e., can't grab any mutex there. It won't 
be exposed
unless in non FIP mode.

So, I'm afraid don't have a good solution to this yet, will have to rework this 
one.

Thanks,
yi

> ---
>  drivers/scsi/fcoe/fcoe.c      |    2 ++
>  drivers/scsi/fcoe/fcoe_ctlr.c |    2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
> index 666b7ac..02de3c9 100644
> --- a/drivers/scsi/fcoe/fcoe.c
> +++ b/drivers/scsi/fcoe/fcoe.c
> @@ -563,11 +563,13 @@ static void fcoe_update_src_mac(struct fc_lport
> *lport, u8 *addr)
>       struct fcoe_port *port = lport_priv(lport);
>       struct fcoe_interface *fcoe = port->priv;
> 
> +     rtnl_lock();
>       if (!is_zero_ether_addr(port->data_src_addr))
>               dev_uc_del(fcoe->netdev, port->data_src_addr);
>       if (!is_zero_ether_addr(addr))
>               dev_uc_add(fcoe->netdev, addr);
>       memcpy(port->data_src_addr, addr, ETH_ALEN);
> +     rtnl_unlock();
>  }
> 
>  /**
> diff --git a/drivers/scsi/fcoe/fcoe_ctlr.c b/drivers/scsi/fcoe/fcoe_ctlr.c
> index 2ebe03a..485a1e7 100644
> --- a/drivers/scsi/fcoe/fcoe_ctlr.c
> +++ b/drivers/scsi/fcoe/fcoe_ctlr.c
> @@ -2760,7 +2760,9 @@ static void fcoe_ctlr_vn_timeout(struct fcoe_ctlr *fip)
>               hton24(mac, FIP_VN_FC_MAP);
>               hton24(mac + 3, new_port_id);
>               fcoe_ctlr_map_dest(fip);
> +             mutex_unlock(&fip->ctlr_mutex);
>               fip->update_mac(fip->lp, mac);
> +             mutex_lock(&fip->ctlr_mutex);
>               fcoe_ctlr_vn_send_claim(fip);
>               next_time = jiffies + msecs_to_jiffies(FIP_VN_ANN_WAIT);
>               break;
> 
> _______________________________________________
> devel mailing list
> [email protected]
> https://lists.open-fcoe.org/mailman/listinfo/devel
_______________________________________________
devel mailing list
[email protected]
https://lists.open-fcoe.org/mailman/listinfo/devel

Reply via email to