On 11 June 2012 02:14, Michael Hope <michael.h...@linaro.org> wrote:
> We talked at Connect about finishing up the cortex-strings work by
> upstreaming them into Bionic, Newlib, and GLIBC.  I've written up one
> of our standard 'Output' pages:
>
>  https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs/CortexStrings
>
> with a summary of what we did, what else exists, benchmark results,
> and next steps.  This can be used to justify the routines to the
> different upstreams.
>
> The Android guys are going to upstream these to Bionic.  I need a
> volunteer to do Newlib and GLIBC.
>
> One surprise was that the Newlib plain C routines are very good on
> strings - probably due to a good end of string detector.

Those graphs end at 4k, which is well within even L1 cache.  How do
these functions compare for sizes that hit L2 or external memory?
I would expect functions doing some prefetching to perform better
there.  Some time ago, I compared a few memcpy() implementations
on large blocks, and the Bionic NEON-optimised one was several
times faster than glibc.  It is of course possible that glibc has
improved since then.
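
For reference, a minimal sketch of the kind of test I mean: time
whatever memcpy() the C library provides over block sizes that start
at 4k and grow well past typical L1 and L2 sizes.  The helper names
(now_sec, bench_memcpy, sweep_sizes), the size range, and the amount
of data moved per size are all my own assumptions for illustration,
not taken from the cortex-strings benchmarks.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Monotonic wall-clock time in seconds. */
static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/*
 * Hypothetical helper: copy a block of `size` bytes repeatedly until
 * roughly `total` bytes have been moved.  Returns throughput in MB/s
 * (10^6 bytes), or a negative value on bad input or allocation failure.
 */
static double bench_memcpy(size_t size, size_t total)
{
    unsigned char *src, *dst;
    volatile unsigned char sink = 0;
    size_t reps, i;
    double t0, dt;

    if (size == 0)
        return -1.0;
    src = malloc(size);
    dst = malloc(size);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }

    reps = total / size;
    if (reps == 0)
        reps = 1;

    /* Touch both buffers first so page faults are not timed. */
    memset(src, 0x5a, size);
    memset(dst, 0, size);

    t0 = now_sec();
    for (i = 0; i < reps; i++) {
        memcpy(dst, src, size);
        /* Read one byte back to discourage the compiler from
         * dropping the copy. */
        sink = dst[i % size];
    }
    dt = now_sec() - t0;
    (void)sink;

    free(src);
    free(dst);

    if (dt <= 0.0)
        dt = 1e-9;      /* guard against a coarse clock */
    return (double)size * (double)reps / dt / 1e6;
}

/* Hypothetical helper: print one line per power-of-two block size. */
static void sweep_sizes(size_t min_size, size_t max_size, size_t total)
{
    size_t size;

    assert(min_size > 0 && min_size <= max_size);
    for (size = min_size; size <= max_size; size *= 2)
        printf("%10lu bytes: %10.1f MB/s\n",
               (unsigned long)size, bench_memcpy(size, total));
}
```

Calling something like sweep_sizes(4096, 16u << 20, 256u << 20) from
main() would cover 4k through 16M.  Where the throughput drops should
roughly mark the cache boundaries on the machine under test, and
running the same sweep against each implementation would show whether
prefetching helps there.  Older glibc versions may need -lrt for
clock_gettime().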

-- 
Mans Rullgard / mru

_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain