Re: Benchmarking / justifying cortex-strings

David Gilbert Mon, 05 Sep 2011 02:33:16 -0700

On 5 September 2011 04:21, Michael Hope <michael.h...@linaro.org> wrote:
> On Fri, Sep 2, 2011 at 4:08 PM, Michael Hope <michael.h...@linaro.org> wrote:
>> Hi Dave.  I've been hacking away and have checked in a couple of
>> benchmarking and plotting scripts to lp:cortex-strings.  The current
>> results are at:
>>  http://people.linaro.org/~michaelh/incoming/strings-performance/
>>
>> All are done on an A9.  The results are very incomplete due to how
>> long things take to run.  I'll leave ursa3 doing these over the
>> weekend which should flesh this out for the other routines.
>
> Right, that's done.  The new graphs are up at:
>  http://people.linaro.org/~michaelh/incoming/strings-performance/
>
> The original data is at:
>  http://people.linaro.org/~michaelh/incoming/strings-performance/epic.txt
>
> Here's the relative performance for all routines with eight byte
> aligned data and 128 byte blocks:
>  http://people.linaro.org/~michaelh/incoming/strings-performance/top-000128.png
>
> memchr, memcpy, strcpy, and strlen all look good at this block size.


Good.

> Here's the speed versus block size for eight byte aligned data:
>  http://people.linaro.org/~michaelh/incoming/strings-performance/sizes-memchr-08.png

Nice; odd dip between 8 and 16 chars - I don't switch to the smarter
stuff until 16 bytes.
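
To make the 16 byte switch-over concrete, here is a rough C sketch of a memchr that uses a plain byte loop for short blocks and a word-at-a-time scan for longer ones. It is an illustration only, not the cortex-strings routine: the name sketch_memchr, the SKETCH_WORD_THRESHOLD constant, and the 32-bit "zero byte in word" trick are all assumptions chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical threshold, taken from the "16 bytes" figure mentioned above. */
#define SKETCH_WORD_THRESHOLD 16

/* Illustrative memchr: byte loop for short blocks, word scan for long ones. */
void *sketch_memchr(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    if (n >= SKETCH_WORD_THRESHOLD) {
        /* Step a byte at a time until p is 4-byte aligned (at most 3 steps). */
        while (((uintptr_t)p & 3) != 0) {
            if (*p == ch)
                return (void *)p;
            p++;
            n--;
        }

        /* Replicate the target byte into every byte lane of a word. */
        uint32_t pattern = 0x01010101u * ch;

        while (n >= 4) {
            uint32_t w;
            memcpy(&w, p, 4);            /* aligned load without aliasing issues */
            uint32_t x = w ^ pattern;    /* a matching byte becomes 0x00 */
            /* Nonzero exactly when some byte of x is zero. */
            if ((x - 0x01010101u) & ~x & 0x80808080u)
                break;                   /* match is somewhere in this word */
            p += 4;
            n -= 4;
        }
    }

    /* Short blocks, the word containing the match, and any tail bytes. */
    while (n--) {
        if (*p == ch)
            return (void *)p;
        p++;
    }
    return NULL;
}
```

With a structure like this, blocks just under the threshold pay the full per-byte cost, which is one plausible (unconfirmed) reason for a dip just below the switch-over point.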

>  http://people.linaro.org/~michaelh/incoming/strings-performance/sizes-memset-08.png

Hmm, yes, the short ones could be a bit faster - I always tended to use
log X scales :-)
The really small ones I wouldn't worry too much about; the interesting
stuff is 32-512 bytes, where I'd have expected it to have got its act
in gear.

>  http://people.linaro.org/~michaelh/incoming/strings-performance/sizes-strchr-08.png

The version of strchr that's in there is the simple-as-possible
strchr; it works a byte at a time. I also have a version that uses
similar code to memchr; it goes fast at large sizes but is slower for
small matches.

See:
https://wiki.linaro.org/WorkingGroups/ToolChain/Benchmarks/InitialStrchr?action=AttachFile&do=get&target=panda-01-strchr-git44154ec-strchr-abs.png

I'd made the call that performance at smaller strings was probably
more important.
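
For reference, a byte-at-a-time strchr of the kind described here can be sketched in C as below. This is only an illustration of the approach, not the routine that is checked in; the name sketch_strchr is made up for the example.

```c
#include <stddef.h>

/* Illustrative byte-at-a-time strchr: one load and two compares per byte. */
char *sketch_strchr(const char *s, int c)
{
    char ch = (char)c;

    for (;; s++) {
        if (*s == ch)
            return (char *)s;   /* also matches the terminator when c == 0 */
        if (*s == '\0')
            return NULL;        /* reached the end without a match */
    }
}
```

There is no alignment or setup work before the first compare, which is why this form tends to do well when the match is only a few bytes in.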

>  http://people.linaro.org/~michaelh/incoming/strings-performance/sizes-strcmp-08.png

Huh? I haven't written a strcmp - that looks like newlib's?

>  http://people.linaro.org/~michaelh/incoming/strings-performance/sizes-strcpy-08.png

Ditto.

>  http://people.linaro.org/~michaelh/incoming/strings-performance/sizes-strlen-08.png

That's very nice, although quite bizarre; even the lower end of the
steps is suitably fast, so not really anything to worry about, but it
would be great to understand where the 1500 cycle difference is going
at the large end.

Dave
