Re: [PATCH][AARCH64] inline strlen for 8-bytes aligned strings

2018-08-10 Thread Anton Youdkevitch
Wilco, On 10.08.2018 18:04, Wilco Dijkstra wrote: Hi, A quick benchmark shows it's faster up to about 10 bytes, but after that it becomes extremely slow. At 16 bytes it's already 2.5 times slower and for larger sizes it's over 13 times slower than the GLIBC implementation... The implementati

Re: [PATCH][AARCH64] inline strlen for 8-bytes aligned strings

2018-08-10 Thread Wilco Dijkstra
Hi, A quick benchmark shows it's faster up to about 10 bytes, but after that it becomes extremely slow. At 16 bytes it's already 2.5 times slower and for larger sizes it's over 13 times slower than the GLIBC implementation... > The implementation falls back to the library call if the > string is

Re: [PATCH][AARCH64] inline strlen for 8-bytes aligned strings

2018-08-10 Thread Anton Youdkevitch
Richard, On 10.08.2018 16:54, Richard Earnshaw (lists) wrote: On 10/08/18 14:38, Anton Youdkevitch wrote: The patch inlines strlen for 8-byte aligned strings on AARCH64 like it's done on other platforms (power, s390). The implementation falls back to the library call if the string is not aligne

Re: [PATCH][AARCH64] inline strlen for 8-bytes aligned strings

2018-08-10 Thread Richard Earnshaw (lists)
On 10/08/18 14:38, Anton Youdkevitch wrote: > The patch inlines strlen for 8-byte aligned strings on > AARCH64 like it's done on other platforms (power, s390). > The implementation falls back to the library call if the > string is not aligned. Synthetic testing on Cavium T88 > and Cavium T99 showed

[PATCH][AARCH64] inline strlen for 8-bytes aligned strings

2018-08-10 Thread Anton Youdkevitch
The patch inlines strlen for 8-byte aligned strings on AARCH64 like it's done on other platforms (power, s390). The implementation falls back to the library call if the string is not aligned. Synthetic testing on Cavium T88 and Cavium T99 showed the following performance gains: T99: up to 8 bytes:
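For readers following the thread, here is a rough C sketch of the general idea described above: scan 8 bytes at a time when the pointer is 8-byte aligned, and call the library strlen otherwise. This is not the code from the patch (which works inside the compiler's AArch64 backend); the names strlen_aligned8 and has_zero_byte are made up for illustration, and the zero-byte test is the well-known bit trick, assumed here rather than taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if any of the 8 bytes in w is 0x00.
   Classic bit trick; it reports that a zero byte exists, not where. */
static inline int has_zero_byte(uint64_t w)
{
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

/* Hypothetical sketch of "inline strlen for 8-byte aligned strings". */
size_t strlen_aligned8(const char *s)
{
    /* Unaligned pointer: fall back to the library call. */
    if (((uintptr_t)s & 7) != 0)
        return strlen(s);

    const char *p = s;
    for (;;) {
        uint64_t w;
        /* Aligned 8-byte load. It may read past the terminator, but an
           aligned 8-byte word never crosses a page boundary. Strictly
           speaking ISO C does not bless the over-read; a compiler
           expanding strlen inline can make that assumption itself. */
        memcpy(&w, p, sizeof w);
        if (has_zero_byte(w))
            break;
        p += 8;
    }

    /* The terminator is somewhere in this word; find it bytewise. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

The simple one-word-per-iteration loop is cheap for very short strings, which fits the observation in the replies that the inline version only wins up to about 10 bytes; a tuned library routine processes more data per iteration on longer inputs.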