https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99953
Bug ID: 99953
Summary: In AVX, SIMD support environment, strlen performance
without optimization is 3 times faster than optimized
strlen function.
Product: gcc
Version: 9.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: novemberizing at gmail dot com
Target Milestone: ---
Created attachment 50519
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50519&action=edit
the preprocessed file
I tested the performance of 65K bytes string and 65536 times for each O0, O1,
O2, O3, and the related performance was not optimized as shown below. If it is
not optimized, it has been confirmed that glibc@strlen_avx is called.
$ gcc -Wall -Wextra -fno-strict-aliasing -fwrapv
-fno-aggressive-loop-optimizations -fsanitize=undefined -save-temps strlen.c
$ ./a.out
no optimize => 0.07655
o1 optimize => 0.62935
o2 optimize => 0.22461
o3 optimize => 0.23192
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
9.3.0-17ubuntu1~20.04' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-9
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)