> > Thinking about this more - why does this patch help some benchmarks?
> > The amount of work it takes for the hardware to generate a completion
> > is likely negligible, and we still are scanning the same amount
> > of TX WRs in a loop to unmap/free them.
>
> This makes sense but I think you should also consider the fact that
> the tx_lock is taken once per tx_completion so, with the patch,
> the driver spends less time under lock.

Try removing tx_lock from completion path just for the fun of it.
I think you'll find it gains you 5% tops.

-- 
MST
