On Thu, Feb 9, 2017 at 8:41 AM, Tariq Toukan <ttoukan.li...@gmail.com> wrote:
> Hi Eric,
>
> Thanks again for your series.
>
> On 09/02/2017 3:58 PM, Eric Dumazet wrote:
> >
> > As mentioned half a year ago, we better switch mlx4 driver to order-0
> > allocations and page recycling.
> >
> > This reduces vulnerability surface thanks to better skb->truesize
> > tracking and provides better performance in most cases.
> >
> > v2 provides an ethtool -S new counter (rx_alloc_pages) and
> > code factorization, plus Tariq fix.
>
> I see that you made significant changes to the previous series, especially
> patch 14 (RX CQE processing).
> Please notice that our work week has just finished here in Israel.
> I will review the series, especially the new patches (10 to 14), on Sunday.
>
> We need to test this series again in our functional and performance
> regression systems.
> It will be running during the weekend, so we can analyze the results and
> update you on Sunday.
>
> Previous performance results showed a degradation, especially in:
> - TCP single stream at 64KB length.

What RX ring size are you using ? I have not seen this at all.

> - TCP 16 streams at 1KB length.

TCP does not really care, it coalesces all these into TSO skbs, full size...

>
> This was probably because cache was too short, and many page allocations
> were needed.
> In CX4, we saw the same kind of degradation, much clearer and amplified as
> it's 2.5 times faster (100G).
>
> Regards,
> Tariq Toukan