Re: [EXT] [PATCH net-next 16/16] qlge: Refill empty buffer queues from wq

Benjamin Poirier Tue, 09 Jul 2019 18:18:56 -0700

On 2019/06/27 14:18, Manish Chopra wrote:
> > -----Original Message-----
> > From: Benjamin Poirier <bpoir...@suse.com>
> > Sent: Monday, June 17, 2019 1:19 PM
> > To: Manish Chopra <mani...@marvell.com>; GR-Linux-NIC-Dev <GR-Linux-
> > nic-...@marvell.com>; netdev@vger.kernel.org
> > Subject: [EXT] [PATCH net-next 16/16] qlge: Refill empty buffer queues from
> > wq
> > 
> > External Email
> > 
> > ----------------------------------------------------------------------
> > When operating at mtu 9000, qlge does order-1 allocations for rx buffers in
> > atomic context. This is especially unreliable when free memory is low or
> > fragmented. Add an approach similar to commit 3161e453e496 ("virtio: net
> > refill on out-of-memory") to qlge so that the device doesn't lock up if there
> > are allocation failures.
> > 
[...]
> > +
> > +static void ql_update_buffer_queues(struct rx_ring *rx_ring, gfp_t gfp,
> > +                               unsigned long delay)
> > +{
> > +   bool sbq_fail, lbq_fail;
> > +
> > +   sbq_fail = !!qlge_refill_bq(&rx_ring->sbq, gfp);
> > +   lbq_fail = !!qlge_refill_bq(&rx_ring->lbq, gfp);
> > +
> > +   /* Minimum number of buffers needed to be able to receive at least one
> > +    * frame of any format:
> > +    * sbq: 1 for header + 1 for data
> > +    * lbq: mtu 9000 / lb size
> > +    * Below this, the queue might stall.
> > +    */
> > +   if ((sbq_fail && QLGE_BQ_HW_OWNED(&rx_ring->sbq) < 2) ||
> > +       (lbq_fail && QLGE_BQ_HW_OWNED(&rx_ring->lbq) <
> > +        DIV_ROUND_UP(9000, LARGE_BUFFER_MAX_SIZE)))
> > +           /* Allocations can take a long time in certain cases (ex.
> > +            * reclaim). Therefore, use a workqueue for long-running
> > +            * work items.
> > +            */
> > +           queue_delayed_work_on(smp_processor_id(), system_long_wq,
> > +                                 &rx_ring->refill_work, delay);
> >  }
> > 
> 
> This is probably going to mess up when at the interface load time
> (qlge_open()) allocation failure occurs, in such cases we don't really want
> to re-try allocations using refill_work but rather simply fail the interface
> load.


Why would you want to turn a recoverable failure into a fatal failure?

In case of allocation failure at ndo_open time, allocations are retried
later from a workqueue. Meanwhile, the device can use the available rx
buffers (if any could be allocated at all).
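
To make the retry condition concrete, here is a minimal userspace sketch of
the check done in ql_update_buffer_queues() above. It is not driver code:
struct bq_state, lbq_min_buffers() and needs_deferred_refill() are
hypothetical names made up for illustration, and the large buffer size is
passed in as a parameter instead of using LARGE_BUFFER_MAX_SIZE.

```c
#include <assert.h>
#include <stdbool.h>

/* Same rounding-up division as the kernel's DIV_ROUND_UP() macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical stand-in for the state of one buffer queue. */
struct bq_state {
	bool refill_failed;	/* last refill hit an allocation failure */
	unsigned int hw_owned;	/* buffers currently owned by the hardware */
};

/* Large buffers needed to hold one mtu 9000 frame. */
static unsigned int lbq_min_buffers(unsigned int lb_size)
{
	assert(lb_size > 0);
	return DIV_ROUND_UP(9000u, lb_size);
}

/*
 * Return true when a refill should be retried later: a refill failed
 * and the queue holds fewer buffers than needed to receive one frame.
 */
static bool needs_deferred_refill(const struct bq_state *sbq,
				  const struct bq_state *lbq,
				  unsigned int lb_size)
{
	return (sbq->refill_failed && sbq->hw_owned < 2) ||
	       (lbq->refill_failed &&
		lbq->hw_owned < lbq_min_buffers(lb_size));
}
```

With an assumed large buffer size of 8192 bytes, lbq_min_buffers() gives 2.
A failed refill that still leaves enough buffers with the hardware does not
trigger a retry; the queue keeps receiving and is refilled on the normal path.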

> Just to make sure here in such cases it shouldn't lead to kernel panic etc.
> while completing qlge_open() and leaving refill_work executing in background.
> Or probably handle such allocation failures from the napi context and
> schedule refill_work from there.
> 

I've just tested allocation failures at open time and didn't find
problems; with mtu 9000, using bcc, for example:
tools/inject.py -P 0.5 -c 100 alloc_page "should_fail_alloc_page(gfp_t gfp_mask, unsigned int order) (order == 1) => qlge_refill_bq()"

What exact scenario do you have in mind that's going to lead to
problems? Please try it out and describe it precisely.
