On 1/19/2017 12:10 PM, Bruce Richardson wrote: > On Tue, Jan 17, 2017 at 02:38:20PM +0100, Olivier Matz wrote: >> Hi Bruce, >> >> Maybe it's worth checking the impact. The size check could be done only >> once per bulk, so it may not cost that much. >> >> It's also possible to have a particular case for pointer size, and >> use a memcpy for other sizes. >> >> > <snip> >> I think having a performance test showing storing the elt size in the >> ring structure has a deep impact would help to reach a consensus >> faster :) >> >> > Hi Olivier, > > I did a quick prototype using a switch statement for three data element > sizes: 8, 16, and 32 bytes. The performance difference was neglible to > none. In most cases, with ring_perf_autotest on my system, there was a > small degradation, of less than 1 cycle per packet, and a few were > slightly faster, probably due to the natural variation in results > between runs. I did not test with any memcpy calls in the datapath, all > assignments were done using uint64_t's or vectors of the appropriate > sizes. > > Therefore it looks like some kind of solution without macros and using a > stored element size is possible. However, I think there is a third > alternative too. It is outlined below as option 3. > > 1. Use macros as in original RFC > > 2. Update rte_ring like I did for tests described above so that > create takes the size parameter, and the switch statment in > enqueue and dequeue looks that up at runtime. > This means that rte_ring becomes the type used for all > transfers of all sizes. Also, enqueues/dequeue functions take > void * or const void * obj_table parameters rather than void > ** and void * const * obj_table. > Downside, this would change the ring API and ABI, and the > ring maintains no type information > > 3. Update rte_ring as above but rename it to rte_common_ring, > and have the element size parameter passed to enqueue and > dequeue functions too - allowing the compiler to optimise the > switch out. Then we update the existing rte_ring to use the > rte_common_ring calls, passing in sizeof(void *) as parameter > to each common call. An event-ring type, or any other ring > types can similarly be written using common ring code, and > present the appropriate type information on enqueue/dequeue > to the apps using them. > Downside: more code to maintain, and more specialised APIs. > > Personally, because I like having type-specialised code, I prefer the > third option. It also gives us the ability to change the common code > without affecting the API/ABI of the rings [which could be updated later > after a proper deprecation period, if we want].
+1 for third option. > > An example of a change I have in mind for this common code would be some > rework around the watermarks support. While the watermarks support is > useful, for the event_rings we actually need more information provided > from enqueue. To that end, I would see the common_rings code changed so > that the enqueue function returns an additional parameter of the > amount of space left in the ring. This information is computed by the > function anyway, and can therefore be efficiently returned by the calls. > For sp_enqueue, this extra parameter would allow the app to know the > minimum number of elements which can be successfully enqueued to the > ring in a subsequent call. The existing rte_ring code can use the return > value to calculate itself if the watermark is exceeded and return a > value as it does now. Other ring types can then decide themselves if > they want to provide watermark functionality, or what their API set > would be - though it's probably best to keep the APIs consistent. > > Further thoughts? > /Bruce >