On Tue, 2021-01-12 at 10:14 -0800, Sukadev Bhattiprolu wrote: > The work queue is used to queue reset requests like CHANGE-PARAM or > FAILOVER resets for the worker thread. When the adapter is being > removed > the adapter state is set to VNIC_REMOVING and the work queue is > flushed > so no new work is added. However the check for adapter being removed > is > racy in that the adapter can go into REMOVING state just after we > check > and we might end up adding work just as it is being flushed (or > after). > > Use a new spin lock, ->remove_lock to ensure that after (while) the > work > queue is (being) flushed, no new work is queued. > > A follow-on patch will convert the existing ->state_lock from a spin > lock > to to a mutex to allow us to hold it for longer periods of time. > Since > work can be scheduled from a tasklet, we cannot block on the mutex, > so > use a new spin lock. > > Fixes: 6954a9e4192b ("ibmvnic: Flush existing work items before > device removal") > Signed-off-by: Sukadev Bhattiprolu <suka...@linux.ibm.com> > --- > drivers/net/ethernet/ibm/ibmvnic.c | 15 ++++++++++++++- > drivers/net/ethernet/ibm/ibmvnic.h | 2 ++ > 2 files changed, 16 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c > b/drivers/net/ethernet/ibm/ibmvnic.c > index ad551418ac63..7645df37e886 100644 > --- a/drivers/net/ethernet/ibm/ibmvnic.c > +++ b/drivers/net/ethernet/ibm/ibmvnic.c > @@ -2435,6 +2435,7 @@ static int ibmvnic_reset(struct ibmvnic_adapter > *adapter, > enum ibmvnic_reset_reason reason) > { > struct net_device *netdev = adapter->netdev; > + unsigned long flags; > int ret; > > if (adapter->state == VNIC_PROBING) { > @@ -2443,6 +2444,8 @@ static int ibmvnic_reset(struct ibmvnic_adapter > *adapter, > goto err; > } > > + spin_lock_irqsave(&adapter->remove_lock, flags); > + > /* > * If failover is pending don't schedule any other reset. > * Instead let the failover complete. If there is already a > @@ -2465,8 +2468,9 @@ static int ibmvnic_reset(struct ibmvnic_adapter > *adapter, > netdev_dbg(adapter->netdev, "Scheduling reset (reason %d)\n", > reason); > schedule_work(&adapter->ibmvnic_reset); > > - return 0; > + ret = 0; > err: > + spin_unlock_irqrestore(&adapter->remove_lock, flags); > return -ret; > } > > @@ -5387,6 +5391,7 @@ static int ibmvnic_probe(struct vio_dev *dev, > const struct vio_device_id *id) > memset(&adapter->pending_resets, 0, sizeof(adapter- > >pending_resets)); > spin_lock_init(&adapter->rwi_lock); > spin_lock_init(&adapter->state_lock); > + spin_lock_init(&adapter->remove_lock); > mutex_init(&adapter->fw_lock); > init_completion(&adapter->init_done); > init_completion(&adapter->fw_done); > @@ -5467,7 +5472,15 @@ static int ibmvnic_remove(struct vio_dev *dev) > return -EBUSY; > } > > + /* If ibmvnic_reset() is scheduling a reset, wait for it to > + * finish. Then prevent it from scheduling any more resets > + * and have the reset functions ignore any resets that have > + * already been scheduled. > + */ > + spin_lock_irqsave(&adapter->remove_lock, flags); > adapter->state = VNIC_REMOVING; > + spin_unlock_irqrestore(&adapter->remove_lock, flags); > +
Why irqsave/restore variants ? are you expecting this spinlock to be held in interruptcontext ? > spin_unlock_irqrestore(&adapter->state_lock, flags); > > flush_work(&adapter->ibmvnic_reset); > diff --git a/drivers/net/ethernet/ibm/ibmvnic.h > b/drivers/net/ethernet/ibm/ibmvnic.h > index 1179a95a3f92..2779696ade09 100644 > --- a/drivers/net/ethernet/ibm/ibmvnic.h > +++ b/drivers/net/ethernet/ibm/ibmvnic.h > @@ -1081,6 +1081,8 @@ struct ibmvnic_adapter { > spinlock_t rwi_lock; > enum ibmvnic_reset_reason pending_resets[VNIC_RESET_MAX-1]; > short next_reset; > + /* serialize ibmvnic_reset() and ibmvnic_remove() */ > + spinlock_t remove_lock; > struct work_struct ibmvnic_reset; > struct delayed_work ibmvnic_delayed_reset; > unsigned long resetting;