From: Felix Manlunas <felix.manlu...@cavium.com>
Date: Mon, 6 Mar 2017 18:45:59 -0800

> Improve UDP TX performance by:
> * reducing the ring size from 2K to 512
> * replacing the numerous streaming DMA allocations for info buffers and
> gather lists with one large consistent DMA allocation per ring

I applied this, because one should always use consistent mappings for
descriptor arrays and things like this, however I am confused about this:

> Also, finding an empty entry (in rbtree of device domain's iova mapping in
> kernel) during Tx path becomes a bottleneck every so often; the loop to
> find the empty entry goes through over 40K iterations;

A properly balanced tree should not require 40K iterations for a search.
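
For scale: a red-black tree with n nodes has height at most 2*log2(n+1), so
a top-down search visits only a few dozen nodes even for very large trees.
Below is a minimal userspace sketch of that bound. The helper name
rb_search_bound is hypothetical and is not a kernel function; it rounds the
logarithm up so the result stays a valid upper bound.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: upper bound on the number of
 * levels a top-down search can descend in a red-black tree holding n nodes.
 * Based on height <= 2*log2(n+1), with log2(n+1) rounded up. The smallest
 * k with 2^k >= n + 1 is the bit length of n, which the loop counts.
 */
static unsigned int rb_search_bound(uint64_t n)
{
    unsigned int bits = 0;

    while (n) {
        n >>= 1;
        bits++;
    }
    return 2 * bits;
}
```

With this bound, a tree of 40,000 entries gives 32 and a tree of 1,000,000
entries gives 40. A loop count above 40K would therefore suggest something
other than a plain top-down search, for example a node-by-node walk; that is
a guess on my part, not something stated in the patch description.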