On Tue, 2015-12-08 at 17:02 +0000, Yuval Mintz wrote:
> > Under heavy TX load, bnx2x_poll() can loop forever and trigger soft lockup
> > bugs.
> >
> > A napi poll handler must yield after one TX completion round, risk of
> > livelock is too high otherwise.
> >
> > Bug is very easy to trigger using a debug build, and udp flood, because of
> > added cpu cycles in TX completion, and we do not receive enough packets to
> > break the loop.
> 
> Eric - I understand what you're doing and it looks fine [to me, at least].
> Out of curiosity, do you know whether removing the loop damages any
> other flow, i.e., by slowing the transmitter as transmission rings get filled
> completely between consecutive NAPI runs?

I have seen no downsides so far. These days, TX is usually throttled by BQL
well before the TX ring fills completely.

I added some instrumentation, and even with the patch applied on a non-debug
kernel we can still see:

bnx2x: bnx2x_poll() took 455036 nsec

455 usec for a single bnx2x_poll() call is already quite long, but one can
tweak the TX ring size (ethtool -G eth1 tx ....) if latencies are a serious
concern.
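For illustration, the kind of timing instrumentation mentioned above could look roughly like the user-space sketch below. It is not the code I actually added to the driver: timed_call(), now_ns() and busy_work() are hypothetical names, and a kernel version would use a kernel clock source instead of clock_gettime().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* for clock_gettime() under strict -std modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Monotonic timestamp in nanoseconds (user-space stand-in for a kernel clock). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical workload standing in for one poll invocation. */
void busy_work(void *arg)
{
	volatile unsigned long sink = 0;
	unsigned long n = *(unsigned long *)arg;

	for (unsigned long i = 0; i < n; i++)
		sink += i;
}

/*
 * Run fn(arg), measure how long it took, and log the duration when it
 * exceeds threshold_ns. Returns the measured duration in nanoseconds.
 */
uint64_t timed_call(void (*fn)(void *), void *arg,
		    uint64_t threshold_ns, const char *name)
{
	uint64_t start, delta;

	assert(fn != NULL);
	start = now_ns();
	fn(arg);
	delta = now_ns() - start;
	if (delta > threshold_ns)
		printf("%s took %llu nsec\n", name, (unsigned long long)delta);
	return delta;
}
```

Calling timed_call(busy_work, &n, 100000, "busy_work") would print a line only when the call exceeds 100 usec, which keeps the log quiet on the fast path.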

Note that my patch should not slow transmitters. It only gives other
softirqs a chance to be serviced, and eventually hands control to ksoftirqd
under stress.
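To make the difference concrete, here is a small user-space simulation of the two handler shapes. It is only a sketch under stated assumptions: struct sim_queue and the sim_*() functions are hypothetical and do not mirror the bnx2x structures, and the "refill" field simply models TX completions that keep arriving while a round is being processed.

```c
#include <assert.h>

/* Hypothetical stand-in for one NIC queue; not the real driver state. */
struct sim_queue {
	int tx_pending; /* TX completions waiting to be reaped */
	int rx_pending; /* RX packets waiting to be processed */
	int tx_refill;  /* completions that show up during each TX round */
};

/* One TX completion round: reap what is pending; new work arrives meanwhile. */
static void sim_tx_round(struct sim_queue *q)
{
	q->tx_pending = q->tx_refill;
}

/* Process up to 'budget' RX packets and return how many were handled. */
static int sim_rx_round(struct sim_queue *q, int budget)
{
	int done = q->rx_pending < budget ? q->rx_pending : budget;

	q->rx_pending -= done;
	return done;
}

/*
 * Looping shape: stay in the handler until no work is left or the RX
 * budget is consumed. max_iters exists only so the simulation terminates.
 * Returns the number of loop iterations.
 */
int sim_poll_looping(struct sim_queue *q, int budget, int max_iters)
{
	int iters = 0, work = 0;

	assert(budget > 0);
	while (iters < max_iters) {
		iters++;
		if (q->tx_pending)
			sim_tx_round(q);
		work += sim_rx_round(q, budget - work);
		if (work >= budget)
			break;
		if (!q->tx_pending && !q->rx_pending)
			break;
	}
	return iters;
}

/*
 * Yielding shape: one TX round and one RX round per call. If work remains,
 * report the full budget so the caller schedules another poll later.
 */
int sim_poll_yielding(struct sim_queue *q, int budget)
{
	int work;

	assert(budget > 0);
	if (q->tx_pending)
		sim_tx_round(q);
	work = sim_rx_round(q, budget);
	if (work < budget && (q->tx_pending || q->rx_pending))
		return budget;
	return work;
}
```

With sustained TX completions and no RX traffic, the looping version only stops at the artificial max_iters cap, while the yielding version returns after a single round and leaves the rescheduling decision to its caller.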

