On Fri, Nov 24, 2017 at 4:05 AM, Saeed Mahameed <sae...@dev.mellanox.co.il> wrote: > On Sun, Nov 5, 2017 at 9:44 PM, Andy Gospodarek <a...@greyhouse.net> wrote: >> From: Andy Gospodarek <go...@broadcom.com> >> >> This RFC converts the adaptive interrupt moderation library from the >> mlx5_en driver into a library so it can be used by any driver. The last >> patch in this set adds support for interrupt moderation in the bnxt_en >> driver. >> >> The main purpose of this code in the mlx5_en driver is to allow an >> administrator to make sure that default coalesce settings are optimized >> for low latency, but quickly adapt to handle high throughput traffic and >> optimize how many packets are received during each napi poll. >> >> For any new driver the following changes would be needed to use this >> library: >> >> - add elements in ring struct to track items needed by this library >> - create function that can be called to actually set coalesce settings >> for the driver >> >> My main reason for making this an RFC is that I would like verification >> from Mellanox that the performance of their driver does not change in a >> unintended way. I did some basic testing (netperf) and did not note a >> statistically significant change in throughput or CPU utilization before >> and after this set. >> >> Credit to Rob Rice and Lee Reed for doing some of the initial proof of >> concept and testing for this patch. > > Hi Andy, > > Following our conversation in netdev 2.2, i would like to suggest the > following: > > Instead of introducing a new API which demands from the driver to > provide callbacks and function pointers to the adaptive moderation > logic, which might be called on every irq interrupt, and to avoid > performance hit, we can move the generic code and the core adaptive > moderation logic to a header file. >
I would like also to suggesting adding Tal Gilboa, as the official maintainer for this new file. as he is the current maintainer and the co-author of this feature in mlx5. > the mlx5e am logic and data structures are already written in a very > modular way and can be stripped out of mlx5e fairly easily. > And i would like to suggest to do it in the following manner: > > 1. naming convention: > I would like to change the generic code naming convention to have the > words DIM (Dynamically-Tuned Interrupt Moderation) instead of mlx5e_am > or am, Following our public blog [1] of the matter and the official > name we prefer for this feature. > > [1] https://community.mellanox.com/docs/DOC-2511 > > Suggested naming convention instead of rx_am: net_dim (DIM for net > applications). > As the rx_am or (dim) logic can be applied to other applications. > > 2. Data types: > > All below mlx5e am data types can be used as is as they hold nothing > mlx5 related. > > struct mlx5e_rx_am_sample > - Holds the current stats sample with ktime stamp > - rename to: net_dim_sample > > struct mlx5e_rx_am_stats > - Holds the needed stats (delta) calculation of last 2 samples > - rename to: net_dim_stats > > struct mlx5e_rx_am > - Adaptive moderation handle > - rename to: net_dim > > 3. static inline generic functions API (based on the usage from > mlx5e_rx_am function) > > //Make a DIM measurement: > net_dim_sample(struct *net_dim_sample sample, packets, bytes, event_ctr) > - previously mlx5e_am_sample() > - Fills a sample struct with the provided stats and the current timestamp > > > //start a new DIM measurement and handles the DIM state machine initial state: > net_dim_start_sample(struct *net_dim rx_dim) > - Makes a new measurement > - stores it into rx_dim->start > - rx_dim->state = DIM_MEASURE_IN_PROGRESS > > > // Takes a new sample (curr_sample) and makes the decision (handles > DIM_MEASURE_IN_PROGRESS state) > net_dim_decision(struct *net_dim rx_dim, curr_sample) > - previously mlx5e_am_decision > - Note, instead of providing the current_stats (delta between start > and current_sample) I suggest to provide the current_sample and move > the stats calculation logic into net_dime_decision. > - All the logic in this function will move to the generic code. > > 4. Driver implementation: (according to the above suggested API) > - Driver should initialize struct net_dim rx_dim, and provide a > work function to handle "dim apply new profile" decision. > - in napi_poll driver should implement the rx_dim state machine > using the above API before arming the completion event queues as > follows: > > mlx5e_rx_am: > > void mlx5e_rx_am(struct mlx5e_rq *rq) > { > struct net_dim *rx_dim = &rq->dim; > struct net_dim_sample end_sample; > u16 nevents; > > switch (rx_dim->state) { > case DIM_MEASURE_IN_PROGRESS: > // driver specific pre condition to decide whether to > continue or skip > // Note that here we only sample and don't calc the delta > stats, this logic moved into net_dim_decision > net_dim_sample(rq, &end_sample, rq->packets, rq->bytes, > cq->events); > if (net_dim_decision(rx_dim, &end_sample)) { > rx_dim->state = DIM_APPLY_NEW_PROFILE; > schedule_work(&rx_dim->work); > } > /* fall through */ > case DIM_START_MEASURE: > net_dim_start_sample(rx_dim); > break; > case DIM_APPLY_NEW_PROFILE: > break; > } > > Thanks, > Saeed. >