On Mon Jun 29, 2020 at 3:07 PM CEST, David Miller wrote:
> From: Tobias Waldekranz <tob...@waldekranz.com>
> Date: Mon, 29 Jun 2020 21:16:01 +0200
>
> > In the ISR, we poll the event register for the queues in need of
> > service and then enter polled mode. After this point, the event
> > register will never be read again until we exit polled mode.
> >
> > In a scenario where a UDP flow is routed back out through the same
> > interface, i.e. "router-on-a-stick", we'll typically only see an rx
> > queue event initially. Once we start to process the incoming flow,
> > we'll be locked in polled mode, but we'll never clean the tx rings,
> > since that event is never caught.
> >
> > Eventually the netdev watchdog will trip, causing all buffers to be
> > dropped and then the process starts over again.
> >
> > By adding a poll of the active events at each NAPI call, we avoid the
> > starvation.
> >
> > Fixes: 4d494cdc92b3 ("net: fec: change data structure to support
> > multiqueue")
> > Signed-off-by: Tobias Waldekranz <tob...@waldekranz.com>
>
> I don't see how this can happen, since you process the TX queue
> unconditionally every NAPI pass, regardless of what bits you see
> set in the IEVENT register.
>
> Or don't you? Oh, I see, you don't:
>
>     for_each_set_bit(queue_id, &fep->work_tx, FEC_ENET_MAX_TX_QS) {
>
> That's the problem. Just unconditionally process the TX work regardless
> of what is in IEVENT. That whole ->work_tx member and the code that
> uses it can just be deleted. fec_enet_collect_events() can just return
> a boolean saying whether there is any RX or TX work at all.
Maybe Andy could chime in here, but I think the ->work_tx construction is load-bearing. As far as I can tell, it is the only thing stopping us from trying to process non-existent queues on older versions of the silicon, which only have a single queue.
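For what it's worth, the two points are compatible: TX can be cleaned unconditionally (as David suggests) while still bounding the loop by the number of queues the hardware actually has (the concern above). Here is a minimal, self-contained sketch of that combination; `struct fake_fep`, the `EV_*` bits, and both helpers are hypothetical stand-ins for illustration, not the real fec driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bits, loosely modeled on the IEVENT register. */
#define EV_RXF (1u << 0)
#define EV_TXF (1u << 1)

struct fake_fep {
	unsigned int num_tx_queues; /* 1 on older silicon, more on newer */
	uint32_t ievent;            /* latched hardware events */
	int tx_cleaned[3];          /* per-queue clean counter */
};

/*
 * David's suggestion: no per-queue work bitmap; just report whether
 * there is any RX or TX work at all.
 */
static bool collect_events(struct fake_fep *fep)
{
	uint32_t ev = fep->ievent;

	fep->ievent = 0; /* reading acks/clears the latched events */
	return (ev & (EV_RXF | EV_TXF)) != 0;
}

/*
 * NAPI poll sketch: clean every TX queue that actually exists,
 * regardless of which event bits happened to be latched. Bounding
 * the loop by num_tx_queues (not a compile-time max) keeps us off
 * non-existent queues on single-queue silicon.
 */
static void napi_poll(struct fake_fep *fep)
{
	unsigned int q;

	(void)collect_events(fep);
	for (q = 0; q < fep->num_tx_queues; q++)
		fep->tx_cleaned[q]++;
}
```

In the router-on-a-stick case from the commit message, only an RX event may be latched, but TX still gets cleaned on every pass, so the starvation cannot occur.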