> We dropped rtnl_lock to in fcoe_update_src_mac() to avoid the deadlock of rtnl > mutex with fip controller lock in commit 949e71f. However, calling > dev_uc_add/del() must hold rtnl_lock otherwise it could hit rtnl lock > assertion > failure. Since we have to follow the order of grabbing fip lock after rtnl > lock > since netdev event callback comes in w/ rtnl lock held, when update_mac() is > called in libfcoe, we should drop fip lock first otherwise it will again get > back to the deadlock situation. This issue is observed in VN2VN setup. > > I don't like unlock/lock there, but we need the newly mac to be added before > sending the claim otherwise there is a race that the response frame may come > back before the mac is added to the underlying device's filter table so the > frame may get dropped. > > Signed-off-by: Yi Zou <[email protected]> This RFC should be dropped and I need to think about more as it does not solve all the issues. What facts I know are:
1. rtnl_lock() must be grabbed when dev_uc/mc_del/add() are called. It may not hit rtnl assertion failure depending on if the underlying netdev has unicast filering capability or not, if not, __deve_set_promiscuity() will be called that would fail rtnl assertion. 2. when rtnl mutex and fip->ctrl_mutex are nested w/, it has to be rtnl first, then fip mutex second because we have netdev notficatier callback comes in w/ rtnl already held. 3. one place that update_mac() is called in fip timer work context, which was what RFC is attending to, it is also called in fcoe_ctlr_els_send() for non FIP mode, where It may have a spinlock held already, i.e., can't grab any mutex there. It won't be exposed unless in non FIP mode. So, I'm afraid don't have a good solution to this yet, will have to rework this one. Thanks, yi > --- > drivers/scsi/fcoe/fcoe.c | 2 ++ > drivers/scsi/fcoe/fcoe_ctlr.c | 2 ++ > 2 files changed, 4 insertions(+) > > diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c > index 666b7ac..02de3c9 100644 > --- a/drivers/scsi/fcoe/fcoe.c > +++ b/drivers/scsi/fcoe/fcoe.c > @@ -563,11 +563,13 @@ static void fcoe_update_src_mac(struct fc_lport > *lport, u8 *addr) > struct fcoe_port *port = lport_priv(lport); > struct fcoe_interface *fcoe = port->priv; > > + rtnl_lock(); > if (!is_zero_ether_addr(port->data_src_addr)) > dev_uc_del(fcoe->netdev, port->data_src_addr); > if (!is_zero_ether_addr(addr)) > dev_uc_add(fcoe->netdev, addr); > memcpy(port->data_src_addr, addr, ETH_ALEN); > + rtnl_unlock(); > } > > /** > diff --git a/drivers/scsi/fcoe/fcoe_ctlr.c b/drivers/scsi/fcoe/fcoe_ctlr.c > index 2ebe03a..485a1e7 100644 > --- a/drivers/scsi/fcoe/fcoe_ctlr.c > +++ b/drivers/scsi/fcoe/fcoe_ctlr.c > @@ -2760,7 +2760,9 @@ static void fcoe_ctlr_vn_timeout(struct fcoe_ctlr *fip) > hton24(mac, FIP_VN_FC_MAP); > hton24(mac + 3, new_port_id); > fcoe_ctlr_map_dest(fip); > + mutex_unlock(&fip->ctlr_mutex); > fip->update_mac(fip->lp, mac); > + mutex_lock(&fip->ctlr_mutex); > fcoe_ctlr_vn_send_claim(fip); > next_time = jiffies + msecs_to_jiffies(FIP_VN_ANN_WAIT); > break; > > _______________________________________________ > devel mailing list > [email protected] > https://lists.open-fcoe.org/mailman/listinfo/devel _______________________________________________ devel mailing list [email protected] https://lists.open-fcoe.org/mailman/listinfo/devel
