https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99953

            Bug ID: 99953
           Summary: In AVX, SIMD support environment, strlen performance
                    without optimization is 3 times faster than optimized
                    strlen function.
           Product: gcc
           Version: 9.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: novemberizing at gmail dot com
  Target Milestone: ---

Created attachment 50519
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50519&action=edit
the preprocessed file

I tested the performance of 65K bytes string and 65536 times for each O0, O1,
O2, O3, and the related performance was not optimized as shown below. If it is
not optimized, it has been confirmed that glibc@strlen_avx is called.

$ gcc -Wall -Wextra -fno-strict-aliasing -fwrapv
-fno-aggressive-loop-optimizations  -fsanitize=undefined -save-temps strlen.c

$ ./a.out 
no optimize =>  0.000007655
o1 optimize =>  0.000062935
o2 optimize =>  0.000022461
o3 optimize =>  0.000023192

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
9.3.0-17ubuntu1~20.04' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-9
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)

Reply via email to