> On Wed, Sep 28, 2011 at 04:41:47AM -0700, Andi Kleen wrote:
> > Michael Zolotukhin <michael.v.zolotuk...@gmail.com> writes:
> > >
> > > Build and 'make check' was tested.
> >
> > Could you expand a bit on the performance benefits? Where does it help?
>
> Especially when glibc these days has very well optimized implementations
> tuned for various CPUs and it is very unlikely beneficial to inline
> memcpy/memset if they aren't really short or have unknown number of
> iterations.

I guess we should update the expansion tables so that we produce function calls more often. I will look at how things behave on my setup. Do you know in which glibc versions the optimized string functions were introduced?

Concerning inline SSE, I think it makes a lot of sense when we know the size and alignment, so that we can output just a few SSE moves instead of a larger number of integer moves. We definitely need some numbers for the loop variants.

Honza