On Tue, 2015-12-08 at 17:02 +0000, Yuval Mintz wrote:
> > Under heavy TX load, bnx2x_poll() can loop forever and trigger soft
> > lockup bugs.
> >
> > A napi poll handler must yield after one TX completion round, risk of
> > livelock is too high otherwise.
> >
> > Bug is very easy to trigger using a debug build, and udp flood,
> > because of added cpu cycles in TX completion, and we do not receive
> > enough packets to break the loop.
>
> Eric - I understand what you're doing and it looks fine [to me, at least].
> Out of curiosity, do you know whether removing the loop damages any
> other flow, i.e., by slowing transmitter as transmission rings gets filled
> completely between consecutive NAPI runs?

I have seen no downsides so far. These days TX is usually throttled by
BQL well before the TX ring fills completely.

I added some instrumentation, and even with the patch applied on a
non-debug kernel we can see:

  bnx2x: bnx2x_poll() took 455036 nsec

455 usec for a single bnx2x_poll() call is already quite long, but one
can tweak the TX ring size (ethtool -G eth1 tx ....) if latencies are a
serious concern.

Note that my patch should not slow transmitters. It only gives other
softirqs a chance to be serviced, and eventually hands control to
ksoftirqd under stress.
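
To make the livelock argument concrete, here is a minimal userspace
sketch in C. It is a toy model, not the bnx2x code and not the patch:
struct toy_queue, toy_tx_round(), toy_rx_round(), toy_poll_looping() and
toy_poll_single_pass() are all hypothetical names, and the tx_refill
field is an assumed stand-in for a sender that keeps producing
completions as fast as they are reaped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, none is driver code. */
struct toy_queue {
    int tx_pending; /* TX completions waiting to be reaped */
    int rx_pending; /* RX packets waiting to be processed */
    int tx_refill;  /* assumed: completions that arrive during each round */
};

/* Reap all pending TX completions; a flooding sender refills the queue. */
static void toy_tx_round(struct toy_queue *q)
{
    q->tx_pending = q->tx_refill;
}

/* Process at most 'budget' RX packets and return how many were handled. */
static int toy_rx_round(struct toy_queue *q, int budget)
{
    int n = q->rx_pending < budget ? q->rx_pending : budget;

    q->rx_pending -= n;
    return n;
}

/*
 * Looping variant: repeat TX + RX rounds until the RX budget is consumed
 * or no work is left. Only RX counts against the budget, so a TX flood
 * with little RX never satisfies either exit condition. max_rounds is a
 * safety cap so the demo terminates. Returns the number of rounds run.
 */
static int toy_poll_looping(struct toy_queue *q, int budget, int max_rounds)
{
    int work_done = 0;
    int rounds = 0;

    assert(budget > 0);
    while (rounds < max_rounds) {
        rounds++;
        toy_tx_round(q);
        work_done += toy_rx_round(q, budget - work_done);
        if (work_done >= budget)
            break;
        if (q->tx_pending == 0 && q->rx_pending == 0)
            break;
    }
    return rounds;
}

/*
 * Single-pass variant: one TX completion round, one RX round, then
 * return. *more_work tells the caller whether work is still queued.
 */
static int toy_poll_single_pass(struct toy_queue *q, int budget,
                                bool *more_work)
{
    int work_done;

    assert(budget > 0);
    toy_tx_round(q);
    work_done = toy_rx_round(q, budget);
    *more_work = q->tx_pending > 0 || q->rx_pending > 0;
    return work_done;
}
```

With tx_pending = 32, rx_pending = 2, tx_refill = 32 and a budget of 64,
the looping variant stops only at whatever max_rounds cap is passed in,
while the single-pass variant returns 2 with *more_work set to true. How
a real handler reports that remaining work back to NAPI is outside this
sketch.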