> > > Under heavy TX load, bnx2x_poll() can loop forever and trigger soft lockup
> > > bugs.
> > >
> > > A napi poll handler must yield after one TX completion round, risk
> > > of livelock is too high otherwise.
> > >
> > > Bug is very easy to trigger using a debug build, and udp flood,
> > > because of added cpu cycles in TX completion, and we do not receive
> > > enough packets to break the loop.
> >
> > Eric - I understand what you're doing and it looks fine [to me, at least].
> > Out of curiosity, do you know whether removing the loop damages any
> > other flow, i.e., by slowing transmitter as transmission rings gets
> > filled completely between consecutive NAPI runs?
>
> I saw no downsides yet. Most of the time TX are blocked by BQL these days,
> before complete TX ring filling.
>
> I added some instrumentation, and even after the patch and a non-debug kernel
> we can see :
>
> bnx2x: bnx2x_poll() took 455036 nsec
>
> 455 usec on one bnx2x_poll() is already quite big, but one can tweak TX ring
> (ethtool -G eth1 tx ....) if latencies are a serious concern.
>
> Note that my patch should not slow transmitters, only give a chance for other
> softirqs being serviced, and eventually give control to ksoftirqd under
> stress.
>
Cool. Thanks Eric.

Acked-by: Yuval Mintz <yuval.mi...@qlogic.com>