17/09/2018 10:11, Gavin Hu:
> In __rte_ring_move_prod_head, move the __atomic_load_n up and out of
> the do {} while loop. On failure, the compare-exchange already updates
> old_head, so another load is costly and unnecessary.
>
> This reduces latency slightly, by about 1~5%.
>
> Test results with the patch (two cores):
> SP/SC bulk enq/dequeue (size: 8): 5.64
> MP/MC bulk enq/dequeue (size: 8): 9.58
> SP/SC bulk enq/dequeue (size: 32): 1.98
> MP/MC bulk enq/dequeue (size: 32): 2.30
>
> Fixes: 39368ebfc6 ("ring: introduce C11 memory model barrier option")
> Cc: [email protected]
>
> Signed-off-by: Gavin Hu <[email protected]>
> Reviewed-by: Honnappa Nagarahalli <[email protected]>
> Reviewed-by: Steve Capper <[email protected]>
> Reviewed-by: Ola Liljedahl <[email protected]>
We are missing reviews and acknowledgements on this series.
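
For anyone picking up the review, here is a minimal sketch of the pattern the
commit message describes. It is not the rte_ring code: struct toy_headtail and
toy_move_head are hypothetical names, the free-space check against the tail is
left out, and the memory orders are assumptions chosen only to make the example
compile and run. The point it illustrates is that the GCC/Clang builtin
__atomic_compare_exchange_n writes the current value of the target back into
the "expected" argument when it fails, so a single load before the loop is
enough.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a ring's producer head/tail pair. */
struct toy_headtail {
	uint32_t head;
	uint32_t tail;
};

/*
 * Reserve n slots by advancing ht->head with a compare-and-swap loop.
 * The head value observed before the move is returned through old_head.
 * The capacity check done by the real ring code is omitted here.
 */
static uint32_t
toy_move_head(struct toy_headtail *ht, uint32_t n, uint32_t *old_head)
{
	uint32_t new_head;

	/* Single load, hoisted out of the retry loop. */
	*old_head = __atomic_load_n(&ht->head, __ATOMIC_ACQUIRE);

	do {
		new_head = *old_head + n;
		/*
		 * If the exchange fails, the builtin stores the current value
		 * of ht->head into *old_head, so the next iteration already
		 * has a fresh head and does not need to reload it.
		 */
	} while (!__atomic_compare_exchange_n(&ht->head, old_head, new_head,
					      0, /* strong variant */
					      __ATOMIC_ACQUIRE,
					      __ATOMIC_RELAXED));

	return n;
}
```

With head initially 5, calling toy_move_head(&ht, 3, &old) leaves old at 5 and
head at 8. A later compare-exchange issued with a stale expected value fails
and overwrites that expected value with 8, which is the behaviour the patch
relies on.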