On 11 June 2012 02:14, Michael Hope <michael.h...@linaro.org> wrote:
> We talked at Connect about finishing up the cortex-strings work by
> upstreaming them into Bionic, Newlib, and GLIBC.  I've written up one
> of our standard 'Output' pages:
>
>  https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs/CortexStrings
>
> with a summary of what we did, what else exists, benchmark results,
> and next steps.  This can be used to justify the routines to the
> different upstreams.
>
> The Android guys are going to upstream these to Bionic.  I need a
> volunteer to do Newlib and GLIBC.
>
> One surprise was that the Newlib plain C routines are very good on
> strings - probably due to a good end of string detector.

Those graphs end at 4k, which is well within even L1 cache.  How do
these functions compare for sizes that hit L2 or external memory?  I
would expect functions doing some prefetching to perform better there.

Some time ago, I compared a few memcpy() implementations on large
blocks, and the Bionic NEON-optimised one was several times faster
than glibc.  It is of course possible that glibc has improved since
then.

-- 
Mans Rullgard / mru
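
To make the large-block question concrete, here is a rough sketch of the kind of sweep that would show it.  It is not the harness behind the wiki graphs or my earlier comparison; the names copy_fn, copy_throughput_mib_s and sweep are made up for illustration, and the block sizes (4 KiB up to 64 MiB) and the 256 MiB total per size are arbitrary assumptions.  It times repeated copies with the C11 timespec_get() wall clock, so treat the numbers as indicative only.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical type: any routine with memcpy()'s signature can be measured. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Wall-clock time in seconds (C11 timespec_get; not a monotonic clock). */
static double now_sec(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/*
 * Copy about `total` bytes in blocks of `block` bytes using `fn` and
 * return the throughput in MiB/s, or -1.0 on bad arguments, allocation
 * failure, or an unmeasurably short run.
 */
double copy_throughput_mib_s(copy_fn fn, size_t block, size_t total)
{
    if (fn == NULL || block == 0 || total < block)
        return -1.0;

    unsigned char *src = malloc(block);
    unsigned char *dst = malloc(block);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }

    /* Touch both buffers first so page faults are not part of the timing. */
    memset(src, 0xA5, block);
    memset(dst, 0, block);

    /* Calling through a volatile pointer stops the compiler from
     * inlining the copy or dropping it as redundant. */
    copy_fn volatile copy = fn;
    volatile unsigned char sink = 0;
    size_t reps = total / block;

    double t0 = now_sec();
    for (size_t i = 0; i < reps; i++) {
        src[0] = (unsigned char)i;      /* make every copy distinct */
        copy(dst, src, block);
        sink ^= dst[block - 1];         /* keep the result observable */
    }
    double elapsed = now_sec() - t0;

    free(src);
    free(dst);
    if (elapsed <= 0.0)
        return -1.0;
    return ((double)reps * (double)block) / (1024.0 * 1024.0) / elapsed;
}

/* Print one line per block size, from 4 KiB to 64 MiB in steps of 4x. */
void sweep(copy_fn fn)
{
    const size_t total = (size_t)256 << 20;   /* bytes copied per block size */
    for (size_t block = 4096; block <= ((size_t)64 << 20); block *= 4)
        printf("%10zu bytes: %10.1f MiB/s\n",
               block, copy_throughput_mib_s(fn, block, total));
}
```

Calling sweep(memcpy) from main() gives the figures for whatever libc the program is linked against; passing a different routine with the same signature allows a side-by-side comparison.  Where the throughput steps down should roughly line up with the L1 and L2 sizes of the core under test.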